PLANTS HAVING MODIFIED RESPONSE TO ETHYLENE



This is a continuation-in-part of application Serial
No. 08/086,555 filed July 1, 1993.

The U.S. Government has certain rights in this 
invention pursuant to Department of Energy Contract No. 
DE-FG03-88ER13873.



Technical Field of the Invention

The invention generally relates to modified ETR nucleic
acid and plants transformed with such nucleic acid
which have a phenotype characterized by a modification
in the normal response to ethylene.



Background of the Invention 

Ethylene has been recognized as a plant hormone since
the turn of the century when its effect on pea seedling
development was first described. Neljubow (1901),
Pflanzen Beih. Bot. Zentralb. 10:128-139. Since then,
numerous reports have appeared which demonstrate that
ethylene is an endogenous regulator of growth and
development in higher plants. For example, ethylene
has been implicated in seed dormancy, seedling growth,
flower initiation, leaf abscission, senescence and
fruit ripening. Ethylene is a plant hormone whose
biosynthesis is induced by environmental stress such as
oxygen deficiency, wounding, pathogen invasion and
flooding.

Recently, genes encoding some of the enzymes involved
in ethylene biosynthesis have been cloned. Sato, et
al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86:6621-6625;
Nakajima, et al. (1990) Plant Cell Physiol.
29:989-996; Van Der Straeten, et al. (1990) Proc. Natl.
Acad. Sci. U.S.A. 87:4859-4963; Hamilton, et al. (1991)
Proc. Natl. Acad. Sci. U.S.A. 88:7434-7437; and Spanu,
et al. (1991) EMBO J. 10:2007-2013. The pathway for
ethylene biosynthesis is shown in Fig. 1. As can be
seen, the amino acid methionine is converted to
S-adenosyl-methionine (SAM) by SAM synthetase, and SAM
in turn is converted to 1-aminocyclopropane-1-carboxylic
acid (ACC) by ACC synthase. Adams, et al. (1979) Proc.
Natl. Acad. Sci. U.S.A. 76:170-174. The ACC is then
converted to ethylene by way of the enzyme ACC oxidase.
Yang, et al. (1984) Annu. Rev. Plant Physiol.
35:155-189.

A number of approaches have been taken in an attempt to
control ethylene biosynthesis to thereby control fruit
ripening. Oeller, et al. (1991) Science 254:437-439
report that expression of an antisense RNA to ACC
synthase inhibits fruit ripening in tomato plants.
Hamilton, et al. (1990) Nature 346:284-287 report the
use of an antisense TOM13 (ACC oxidase) gene in
transgenic plants. Picton, et al. (1993) Plant Journal
3:469-481, report altered fruit ripening and leaf
senescence in tomatoes expressing an antisense
ethylene-forming enzyme.



In a second approach, ethylene biosynthesis was 
reportedly modulated by expressing an ACC deaminase in 
plant tissue to lower the level of ACC available for 
conversion to ethylene. See PCT publication No.
WO92/12249 published July 23, 1992, and Klee, et al.
(1991) Plant Cell 3:1187-1193.

While a substantial amount of information has been 
gathered regarding the biosynthesis of ethylene, very 
little is known about how ethylene controls plant 
development. Although several reports indicate that a
high-affinity binding site for ethylene is present in
plant tissues, such receptors have not been identified.
Jerie, et al. (1979) Planta 144:503; Sisler (1979)
Plant Physiol. 54:538; Sisler, et al. (1990) Plant
Growth Reg. 9:157-164, and Sisler (1990)
"Ethylene-Binding Component in Plants", The Plant Hormone
Ethylene, A.K. Mattoo and J.C. Suttle, eds. (Boston)
C.R.C. Press, Inc., pp. 81-90. In Arabidopsis, several
categories of mutants have been reported. In the first
two categories, mutants were reported which produce
excess ethylene or reduced ethylene as compared to the
wild-type. Guzman, et al. (1990) The Plant Cell
2:513-523. In a third category, mutants failed to respond to
ethylene. Id.; Bleecker, et al. (1988) Science
241:1086-1089; Harpham, et al. (1991) Ann. of Botany
68:55-61. The observed insensitivity to ethylene was
described as being either a dominant or recessive
mutation. Id.

Based upon the foregoing, it is clear that the genetic 
basis and molecular mechanism of ethylene interaction 
with plants has not been clearly delineated. Given the 
wide range of functions regulated by ethylene and the 
previous attempts to control ethylene function by
regulating its synthesis, it would be desirable to have
an alternate approach to modulate growth and
development in various plant tissues such as fruits,
vegetables and flowers by altering the interaction of 
ethylene with plant tissue. 

Accordingly, it is an object of the invention to
provide isolated nucleic acids comprising an ethylene
response (ETR) nucleic acid.

In addition, it is an object to provide modifications 
to such ETR nucleic acids to substitute, insert and/or 
delete one or more nucleotides so as to substitute, 
insert and/or delete one or more amino acid residues in
the protein encoded by the ETR nucleic acid. 

Still further, it is an object to provide plant cells 
transformed with one or more modified ETR nucleic 
acids. Such transformed plant cells can be used to 
produce transformed plants wherein the phenotype
vis-a-vis the response of one or more tissues of the plant to
ethylene is modulated. 



Summary of the Invention

In accordance with the foregoing objects, the invention 
includes transformed plants having at least one cell
transformed with a modified ETR nucleic acid. Such
plants have a phenotype characterized by a decrease in
the response of at least one transformed plant cell to
ethylene as compared to a plant not containing the
transformed plant cell.

The invention also includes vectors capable of 
transforming a plant cell to alter the response to 
ethylene. In one embodiment, the vector comprises a 
modified ETR nucleic acid which causes a decrease in 
cellular response to ethylene. Tissue and/or temporal
specificity for expression of the modified ETR nucleic
acid is controlled by selecting appropriate expression 
regulation sequences to target the location and/or time 
of expression of the transformed nucleic acid. 

The invention also includes methods for producing
plants having a phenotype characterized by a decrease
in the response of at least one transformed plant cell
to ethylene as compared to a wild-type plant not
containing such a transformed cell. The method
comprises transforming at least one plant cell with a
modified ETR nucleic acid, regenerating plants from one 
or more of the transformed plant cells and selecting at 
least one plant having the desired phenotype. 

Brief Description of the Drawings 

Figure 1 depicts the biosynthetic pathway for ethylene.

Figures 2A, 2B and 2C depict the genomic nucleic acid 
sequence (SEQ ID NO:1) for the ETR gene from
Arabidopsis thaliana. 

Figures 3A, 3B, 3C and 3D depict the cDNA nucleic acid 
(SEQ ID NO:2) and deduced amino acid sequence (SEQ ID
NO:3) for the ETR gene from Arabidopsis thaliana.

Figures 4A, 4B, 4C and 4D through Figures 7A, 7B, 7C
and 7D depict the cDNA and deduced amino acid sequence
for four mutant ETR genes from Arabidopsis thaliana
which confer ethylene insensitivity. Each sequence
differs from the wild type sequence set forth in Fig.
3 by substitution of one amino acid residue. The etr1-3
(formerly ein1-1) mutation in Fig. 4 (SEQ ID NOs:8
and 9) comprises the substitution of alanine-31 with
valine. The etr1-4 mutation in Fig. 5 (SEQ ID NOs:10
and 11) comprises the substitution of isoleucine-62
with phenylalanine. The etr1-1 (formerly etr) mutation
in Fig. 6 (SEQ ID NOs:4 and 5) comprises the
substitution of cysteine-65 with tyrosine. The etr1-2
mutation in Fig. 7 (SEQ ID NOs:6 and 7) comprises the
substitution of alanine-102 with threonine.

Figure 8 depicts the structure of the cosmid insert
used to localize the ETR1 gene from Arabidopsis
thaliana. The starting position for the chromosome
walk is indicated by a hatched bar. The open bars give
the location and length of DNA segments used as probes
to detect recombination break points. The maximum
number of break points detected by each probe is shown.
The numbers to the right of the ETR1 gene are out of 74
F2 recombinants between etr1-2 and ap-1, and those to
the left of the ETR1 gene are out of 25 F2
recombinants between etr1-1 and clv2. Overlapping YAC
clones EG4E4 and EG2G11 are also shown.

Figures 9A and 9B depict the amino acid sequence
alignments of the predicted ETR1 protein and the
conserved domains of several bacterial histidine
kinases and response regulators. Amino acids are shown
in boldface type at positions where there are at least
two identities with ETR1. In Fig. 9A, the deduced ETR1
amino acid sequence (SEQ ID NOs:12 and 27) (residues
326 to 562) is aligned with the histidine kinase domains
of E. coli BarA (SEQ ID NOs:13 and 28), P. syringae
LemA (SEQ ID NOs:14 and 29) and X. campestris RpfC (SEQ
ID NOs:15 and 30). Boxes surround the five conserved
motifs characteristic of the bacterial histidine kinase
domain as compiled by Parkinson and Kofoid (Parkinson
et al. (1992) Annu. Rev. Genet. 26:71). The conserved
histidine residue that is the supposed site of
autophosphorylation is indicated by an asterisk.
Numbers and positions of amino acids not shown are
given in parentheses. In Fig. 9B, the deduced ETR1
amino acid sequence (residues 610 to 729) (SEQ ID
NOs:15 and 31) is aligned with the response regulator
domains of B. parapertussis BvgS (SEQ ID NOs:17 and
32), P. syringae LemA (SEQ ID NOs:19 and 34) and E.
coli RcsC (SEQ ID NOs:18 and 33). Amino acids are
shown in boldface type where there are at least two
identities with ETR1. Boxes surround the four highly
conserved residues in bacterial response regulators.
The conserved aspartate residue that is the site of
phosphorylation is indicated by an asterisk. Numbers
and positions of amino acids not shown are given in
parentheses. For alignment purposes, a gap ( ) was
introduced in the ETR1 sequence.



Figures 10A and 10B depict specific DNA sequences for
ETR nucleic acids from tomato and Arabidopsis thaliana.
Figure 10A compares the DNA sequence encoding amino
acid residues 1 through 123 (SEQ ID NOs:20 and 21).
Figure 10B compares the ETR nucleic acid sequence
encoding amino acids 306 through 403 (SEQ ID NOs:22 and
23). The vertical lines in each figure identify
homologous nucleotides. 

Figures 11A and 11B compare partial amino acid
sequences (using single letter designation) for an ETR
protein from tomato and Arabidopsis thaliana. Figure
11A compares the amino acid sequence for the ETR
protein for amino acids 1 through 123 (SEQ ID NOs:24
and 25). Figure 11B compares the amino acid sequence
for the ETR protein for residues 306 through 403 (SEQ
ID NOs:26 and 27). The vertical lines indicate exact
sequence homology. Two vertical dots indicate that the 
amino acid residues are functionally conserved. One 
dot indicates weak functional conservation as between 
amino acid residues. 

Figures 12A, 12B, 12C and 12D depict the genomic
nucleic acid sequence (SEQ ID NO: 45) and deduced amino 
acid sequence (SEQ ID NO: 46) for the QITR ETR gene from 
Arabidopsis thaliana. 

Figure 13 depicts the cDNA nucleic acid sequence and
deduced protein sequence for the QITR ETR gene from
Arabidopsis thaliana. 

Figure 14 depicts the genomic nucleic acid sequence
(SEQ ID NO:41) and deduced amino acid sequence (SEQ ID
NO:42) for the Q8 ETR gene from Arabidopsis thaliana.

Figure 15 depicts the cDNA nucleic acid sequence (SEQ 
ID NO: 43) and deduced amino acid sequence (SEQ ID 
NO:44) for the Q8 ETR gene from Arabidopsis thaliana. 

Figure 16 depicts the nucleic acid sequence (SEQ ID 
NO:35) and deduced amino acid sequence (SEQ ID NO:36)
for the TETR nucleic acid from tomato. 

Figure 17 is a comparison of the amino terminal
portions of the TETR and ETR1 proteins from tomato and
Arabidopsis respectively. The top line is the TETR
sequence and extends through amino acid residue 315.
The lower line represents the ETR1 protein sequence and
extends through amino acid residue 316. The vertical
lines and single and double vertical dots have the same
meaning as set forth in the description of Figures 11A
and 11B. The percent identity between these sequence
portions is 73.33%. The percent similarity is 84.76%.

Figure 18 depicts the nucleic acid (SEQ ID NO: 37) and 
deduced amino acid sequence (SEQ ID NO: 38) for the 
TGETRl ETR nucleic acid from tomato. 

Figure 19 depicts the nucleic acid (SEQ ID NO:39) and
deduced amino acid sequence (SEQ ID NO:40) for a
partial sequence of the TGETR2 ETR nucleic acid from 
tomato. 






Figure 20 is a comparison of the amino terminal
portions for the TGETR1 and ETR1 proteins from tomato
and Arabidopsis respectively. The top line is the
TGETR1 sequence through amino acid residue 316. The
bottom line represents the ETR1 protein sequence
through amino acid residue 316. The identity as
between these two sequences is 91.75%. The percent
similarity is 95.87%. The vertical lines and single
and double dots have the same meaning as for Figures
11A and 11B.

Figure 21 is a comparison of an amino terminal portion
of the TGETR2 protein with the corresponding ETR1
sequence. The top line is the TGETR2 sequence from
amino acid residue 11 through amino acid residue 245.
The lower line is the ETR1 sequence from amino acid
residue 1 through amino acid residue 235. The sequence
identity is 85.11% as between these two sequences. The
sequence similarity is 92.34%. The vertical lines and
single and double dots have the same meaning as for
Figures 11A and 11B.



Figure 22 depicts the nucleic acid (SEQ ID NO:50) and
deduced amino acid sequence (SEQ ID NO:51) for the Nr
(Never-ripe) ETR nucleic acid from Never-ripe tomato.
The amino acid sequence in Figure 22 differs from the
TETR sequence in Figure 16 in that the amino acid
residue proline at residue 36 is replaced with leucine.



Detailed Description

The invention provides, in part, plants having cells
transformed with a vector comprising an ETR nucleic
acid or a modified ETR nucleic acid. Such transformed
plant cells have a modulated response to ethylene. In
a preferred embodiment, the expression of a modified
ETR nucleic acid confers a phenotype on the plant
characterized by a decrease in the response to ethylene,
at least for those cells expressing the modified
ETR nucleic acid, as compared to a corresponding
non-transformed plant. Thus, for example, when the
modified ETR nucleic acid is expressed in fruit such as
tomato, the fruit ripening process is retarded thereby
reducing spoilage and extending the shelf life and/or
harvesting season for the fruit. The invention is
similarly useful to prevent spoilage of vegetative
tissue and to enhance the longevity of cut flowers.

As used herein, a "plant ETR nucleic acid" refers to
nucleic acid encoding all or part of a "plant ETR
protein". ETR nucleic acids can initially be
identified by homology to the ETR nucleic acid
sequences disclosed herein but can also be identified
by homology to any identified ETR nucleic acid or amino
acid sequence. Examples of ETR nucleic acids include
ETR1, QITR and Q8 from Arabidopsis and TETR, TGETR1 and
TGETR2 from tomato. ETR nucleic acids, however, are
also defined functionally by their ability to confer a
modulated ethylene response upon transformation into
plant tissue. For example, an antisense construct of
an ETR nucleic acid or modified ETR nucleic acid is
capable of reducing the ethylene response in plant
tissue expressing the antisense or modified ETR nucleic
acid. In addition, transformation with an ETR nucleic
acid or modified ETR nucleic acid can result in
co-suppression of the endogenous ETR alleles which in turn
modifies the ethylene response. Furthermore, ETR
nucleic acids can be modified as described herein to
produce modified ETR nucleic acids which when used to
transform plant tissue result in varying degrees of
ethylene insensitivity in the tissue expressing such
modified ETR nucleic acids. When evaluating a putative
ETR nucleic acid for the ability of a modified form of
the ETR nucleic acid to confer ethylene insensitivity,
it is preferred that a codon or combination of codons
encoding the amino acid residues equivalent to Ala-31,
Ile-62, Cys-65 or Ala-102 in the ETR1 protein of
Arabidopsis thaliana or Pro-36 in the TETR protein in
tomato be modified so as to substitute a different
amino acid residue such as those disclosed herein for
the specified residues.






Plant ETR nucleic acids include genomic DNA, cDNA and
oligonucleotides including sense and anti-sense nucleic
acids as well as RNA transcripts thereof. The genomic
DNA sequence (SEQ ID NO:1) for the ETR1 gene from
Arabidopsis thaliana is shown in Figure 2. The
corresponding cDNA sequence (SEQ ID NO:2) and deduced
ETR amino acid sequence (SEQ ID NO:3) are shown in
Figure 3. An amino terminal domain (i.e., residues 1
through about 316) of the predicted ETR protein
sequence has no homology to known protein sequences.
Approximately midway in the ETR protein (i.e., residues
295 through 313) is a putative transmembrane domain
followed by a putative intracellular domain (i.e.,
residues 314 through 738). A substantial portion of
this putative intracellular domain unexpectedly has
sequence homology to the two component environmental
sensor-regulators known in bacteria. These two
families in bacteria form a conserved sensor-regulator
system that allows the bacteria to respond to a broad
range of environmental fluctuations. It is believed
that the amino terminal portion of the ETR protein
interacts either directly with ethylene or indirectly
(e.g., with an ethylene binding protein or another
protein) and that upon such interaction, signal
transduction through the intracellular domain occurs.

An ETR nucleic acid or ETR protein can be identified by
substantial nucleic acid and/or amino acid sequence
homology to a known ETR sequence. Such homology can be
based upon the overall nucleic acid or amino acid
sequence in which case the overall homology of the
protein sequence is preferably greater than about 50%,
more preferably greater than 60%, still more preferably
greater than 75% and most preferably greater than 90%
homologous. Notwithstanding overall sequence homology,
it is preferred that the unique amino-terminal portion
of an ETR protein sequence or the nucleic acid sequence
encoding this portion of the molecule (i.e., the 5'
terminal portion) be used to identify an ETR protein or
ETR nucleic acid. When using this amino terminal
sequence portion, it is preferred that the amino acid
sequence homology with the known ETR sequence be
greater than about 55%, more preferably about 60%,
still more preferably about 70%, more preferably
greater than 85% and most preferably greater than 95%
homologous. Homology based on nucleic acid sequence is
commensurate with amino acid homology but takes into
account the degeneracy in the genetic code and codon
bias in different plants. Accordingly, the nucleic
acid sequence homology may be substantially lower than
that based on protein sequence. Thus, an ETR protein
is any protein which has an amino-terminal portion
which is substantially homologous to the amino-terminal
domain of a known ETR protein. One such known ETR
protein is the ETR1 protein (see Fig. 3) from
Arabidopsis thaliana. An ETR nucleic acid by analogy
also encodes at least the amino-terminal domain of an
ETR protein.
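
By way of illustration only, the amino-terminal homology criteria described above can be sketched in Python. The function names, the tier labels and the treatment of gaps (gap positions, written "-", are skipped) are assumptions made for this sketch; the specification does not prescribe a particular calculation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap positions carrying identical residues.

    Assumption: both inputs are already aligned to equal length and use
    "-" for gaps; gap positions are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


def amino_terminal_tier(pct: float) -> str:
    """Map a percent homology onto the amino-terminal preference tiers
    recited above (55%, 60%, 70%, 85%, 95%). Labels are hypothetical."""
    if pct > 95.0:
        return "most preferred (>95%)"
    if pct > 85.0:
        return "tier 4 (>85%)"
    if pct >= 70.0:
        return "tier 3 (about 70%)"
    if pct >= 60.0:
        return "tier 2 (about 60%)"
    if pct > 55.0:
        return "tier 1 (>55%)"
    return "below the preferred range"
```

For example, the 73.33% identity reported for Figure 17 would fall in the "about 70%" tier under this sketch, and the 91.75% identity reported for Figure 20 would fall in the "greater than 85%" tier.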

An ETR nucleic acid from a plant species other than
Arabidopsis thaliana can be readily identified by
standard methods utilizing known ETR nucleic acid. For
example, labelled probes corresponding to a known ETR
nucleic acid or encoding the unique amino-terminal
domain can be used for in situ hybridization to detect
the presence of an ETR gene in a particular plant
species. In addition, such probes can be used to
screen genomic or cDNA libraries of a different plant
species or to identify one or more bands containing all
or part of an ETR gene by hybridization to an
electrophoretically separated preparation of genomic
DNA digested with one or more restriction
endonucleases.



The hybridization conditions will vary depending upon
the probe used. When a unique nucleotide sequence of
an ETR nucleic acid is used, e.g., an oligonucleotide
encoding all or part of the amino terminal domain,
relatively high stringency, e.g., about 0.1X SSPE at
65°C, is used. When the hybridization probe covers a
region which has a potentially lower sequence homology
to known ETR nucleic acids, e.g., a region covering a
portion of the unique amino terminal domain and a
portion covering a transmembrane domain, the
hybridization is preferably carried out under moderate
stringency conditions, e.g., about 5X SSPE at 50°C.
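
The choice between the two example conditions above can be summarized in a short Python sketch. The class and function names are hypothetical, and the single boolean input is a simplifying assumption; only the SSPE concentrations and temperatures are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationConditions:
    stringency: str          # descriptive label only
    sspe_concentration: float  # multiples of standard SSPE ("X")
    temperature_c: int       # degrees Celsius


# Example values recited in the paragraph above.
HIGH = HybridizationConditions("high", 0.1, 65)
MODERATE = HybridizationConditions("moderate", 5.0, 50)


def choose_conditions(probe_is_unique_amino_terminal: bool) -> HybridizationConditions:
    """Return the example conditions for a probe.

    Assumption: a probe confined to the unique amino terminal domain is
    treated as the high-stringency case; a probe extending into regions
    of potentially lower homology is treated as the moderate case.
    """
    return HIGH if probe_is_unique_amino_terminal else MODERATE
```

In practice the conditions would be tuned to the particular probe and target species; the two presets here are only the examples given in the text.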

For example, using the above criteria, a ripening
tomato cDNA library (Stratagene, La Jolla, California,
Catalog No. 936004) was screened with a labeled probe
comprising a nucleic acid sequence encoding an amino
terminal portion of the Arabidopsis ETR protein
sequence disclosed herein in Figures 3A, 3B, 3C and 3D.
Several clones were identified and sequenced by
standard techniques. The DNA sequences for this ETR
nucleic acid from tomato (TETR) and Arabidopsis
thaliana (ETR1) encoding amino acid residues 1 through
123 (SEQ ID NOs:20 and 21) and amino acids 306 through
403 (SEQ ID NOs:22 and 23) are set forth in Figures 10A
and 10B, respectively.

The amino acid sequences for the ETR1 protein from
Arabidopsis thaliana and tomato (TETR) for residues 1
through 123 (SEQ ID NOs:25 and 24) and 306 through 403
(SEQ ID NOs:27 and 26) are set forth in Figures 11A and
11B, respectively.

The complete ETR nucleic acid (SEQ ID NO:35) and amino
acid sequence (SEQ ID NO:36) for TETR is shown in Fig.
16. A direct comparison of the amino acid sequence 
between the TETR and ETRl proteins for the amino 
terminal 316 amino acid residues is shown in Fig. 17. 

As can be seen, there is substantial homology between
these particular Arabidopsis and tomato ETR sequences
both on the level of DNA sequence and amino acid
sequence. In particular, the homology on the DNA level
for the sequence encoding amino acids 1 through 45 is
slightly greater than 72%. The homology on the amino
acid level for amino acid residues 1 through 123 is
approximately 79%. For the amino terminal portion
(residues 1 through 316) the overall homology is
approximately 73%. In the case of amino acid sequence
homology, when the differences between the amino acids
at equivalent residues are compared and such
differences comprise the substitution of a conserved
residue, i.e., amino acid residues which are
functionally equivalent, the amino acid sequence
similarity rises to about 90% for the first 123
residues. The sequence similarity for the amino terminal
316 amino acids rises to almost 85%. Such sequence
similarity was determined using a Best Fit sequence
program as described by Devereux et al. (1984) Nucl.
Acids Res. 12:387-395. Functionally equivalent (i.e.,
conserved) residues are identified by double and single
dots in the comparative sequences. Similarly, the
nucleic acid sequence homology between Arabidopsis and
tomato for the sequence encoding amino acid residues
306 to 403 is approximately 75%. The sequence homology
on the amino acid level for identical amino acids is
almost 86% whereas the similarity is almost 96%.
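
The distinction drawn above between identity (identical residues) and similarity (identical plus functionally conserved residues) can be illustrated with a minimal Python sketch. This is not the Best Fit program; the conservative residue groups below are an assumption chosen for illustration, whereas an alignment program would typically use a scoring matrix.

```python
# Hypothetical groups of functionally similar residues (single-letter
# codes). These groupings are an assumption, not taken from the text.
CONSERVATIVE_GROUPS = [
    set("AG"), set("ST"), set("DE"), set("NQ"),
    set("KRH"), set("ILVM"), set("FYW"),
]


def is_conserved(a: str, b: str) -> bool:
    """True if two different residues fall in the same assumed group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (percent identity, percent similarity) over non-gap positions.

    Similarity counts identical residues plus conserved substitutions,
    mirroring the vertical lines and dots used in the figures.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif is_conserved(a, b):
            similar += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

Because every identical position also counts as similar, the similarity figure can never be lower than the identity figure, which is consistent with the paired values reported above (for example, almost 86% identity against almost 96% similarity).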

In addition to ETR1 from Arabidopsis and TETR
(sometimes referred to as TXTR) from tomato, a number of
other ETR nucleic acids have been identified in
Arabidopsis and tomato. In Arabidopsis, the QITR and
Q8 ETR nucleic acids and proteins have been identified.
See Figs. 12, 13, 14 and 15 and SEQ ID NOs:41 through
48. For QITR, the overall nucleic acid homology with
ETR1 is approximately 69%. With regard to the amino
terminal portion between residues 1 and 316, the
homology is approximately 71% identical for amino acid
sequence and approximately 72% identical in terms of
nucleic acid sequence. With regard to Q8, the overall
sequence homology to ETR1 from Arabidopsis is
approximately 69% for the overall nucleic acid sequence
as compared to approximately 81% homology for that
portion of the Q8 encoding the amino terminal 316 amino
acids. The homology on the amino acid level for the
amino terminal portion between Q8 and ETR1 is
approximately 72%.

The other ETR nucleic acids identified in tomato
include TGETR1 (SEQ ID NO:37) and TGETR2 (SEQ ID
NO:39). The deduced protein sequences for TGETR1 (SEQ
ID NO:38) and TGETR2 (SEQ ID NO:40) are set forth in
Figures 18 and 19 respectively. The sequence of TGETR2
is incomplete. A comparison of the sequence homology
for the first 316 amino acid residues of the TGETR1
protein and the ETR1 protein is shown in Fig. 20. The
sequence identity is just under 92%. The sequence
similarity rises to almost 96% between this portion of
these two proteins. With regard to TGETR2, Fig. 21
sets forth a comparison of the amino terminal portion
of this molecule (through amino acid residue 245) with
the corresponding portion of the ETR1 protein. The
identity of sequences between these two sequence
portions is approximately 85%. The sequence similarity
rises to just above 92%.

The cloning and sequencing of the ETR nucleic acids
from Arabidopsis is described in the examples herein.
However, given the extensive disclosure of the
sequences for these ETR nucleic acids, one skilled in
the art can readily construct oligonucleotide probes,
perform PCR amplification or utilize other standard
protocols known to those skilled in the art to isolate
the disclosed genes as well as other ETR nucleic acids
having homology thereto from other species. When
screening the same plant species, relatively moderate
to high stringency conditions can be used for
hybridization, which would vary between 55°C and
65°C in 5X SSPE. When it is desirable to probe for
lower homology or in other plant species, lower
stringency conditions such as 50°C at 5X SSPE can be
used. Washing conditions, however, required 0.2X SSPE.

The isolation of the TETR1 ETR nucleic acid from tomato
is described in the examples. The isolation of this
sequence utilized the amino terminal portion of the
ETR1 gene from Arabidopsis. The other tomato ETR
nucleic acids disclosed herein (TGETR1 and TGETR2) were
identified by probing a tomato genomic library with an
ETR1 probe. The genomic library was made from EMBL 3
to which was ligated a partially Sau3A digested genomic
DNA extract of tomato. Conditions were 65°C in 5X SSC with
washes at 2X SSC.

In reviewing the overall structure of the various ETR
nucleic acids and proteins identified to date, it
appears that at least one class of ETR protein contains
a unique amino terminal portion followed by a histidine
kinase domain followed by a response regulatory region.
This is the ETR1 protein in Arabidopsis. A second
class of ETR protein does not contain the response
regulatory region. Examples of such ETR proteins
include QITR in Arabidopsis and TETR in tomato. The
significance of this is not understood at this time.
However, as described hereinafter, mutations in the ETR
nucleic acids encoding members from each class can
confer a dominant ethylene insensitivity to transgenic
plants containing such nucleic acids.

As described hereinafter, substitution of amino acid
residues Ala-31, Ile-62, Cys-65 and Ala-102 with a
different amino acid results in modified Arabidopsis
ETR nucleic acids which are capable of conferring
ethylene insensitivity in a transformed plant. Each of
these residues is identical as between the ETR protein
of tomato (TETR) and Arabidopsis thaliana (ETR1).

Once the ETR nucleic acid is identified, it can be
cloned and, if necessary, its constituent parts
recombined to form the entire ETR nucleic acid. Once
isolated from its natural source, e.g., contained
within a plasmid or other vector or excised therefrom
as a linear nucleic acid segment, the ETR nucleic acid
can be further used as a probe to identify and isolate
other ETR nucleic acids. It can also be used as a
"precursor" nucleic acid to make modified ETR nucleic
acids and proteins.

WO 95/01439                                PCT/US94/07418

As used herein, the term "modified ETR nucleic acid"
refers to an ETR nucleic acid containing the
substitution, insertion or deletion of one or more
nucleotides of a precursor ETR nucleic acid. The
precursor ETR nucleic acids include naturally-occurring
ETR nucleic acids as well as other modified ETR nucleic
acids. The naturally-occurring ETR nucleic acid from
Arabidopsis thaliana can be used as a precursor nucleic
acid which can be modified by standard techniques, such
as site-directed mutagenesis, cassette mutagenesis and
the like, to substitute one or more nucleotides at a
codon such as that which encodes alanine at residue 31
in the Arabidopsis ETR nucleic acid. Such in vitro
codon modification can result in the generation of a
codon at position 31 which encodes any one of the other
naturally occurring amino acid residues. Such
modification results in a modified ETR nucleic acid.
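
The effect of such a codon substitution on the encoded protein can be illustrated in silico with a minimal Python sketch. The coding sequence used in the usage note is a hypothetical toy sequence, not the ETR1 sequence, and the function names are assumptions made for this sketch; the translation table is the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence; trailing partial codons are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def substitute_codon(cds: str, residue_number: int, new_codon: str) -> str:
    """Replace the codon for a 1-based residue number with new_codon."""
    new_codon = new_codon.upper()
    if new_codon not in CODON_TABLE:
        raise ValueError("new_codon must be three DNA bases")
    start = 3 * (residue_number - 1)
    if residue_number < 1 or start + 3 > len(cds):
        raise ValueError("residue_number is outside the coding sequence")
    return cds[:start] + new_codon + cds[start + 3:]
```

As a usage example on the toy sequence `ATGGCTTGCTAA` (Met-Ala-Cys-stop), `substitute_codon("ATGGCTTGCTAA", 2, "GTT")` changes the alanine codon GCT to the valine codon GTT, analogous to the alanine-to-valine substitution described for residue 31; in the actual ETR nucleic acid the residue number would be 31 rather than 2.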

For example, the mutation responsible for the
phenotype observed in the Never-ripe mutant is disclosed in
the examples. As described, a single point mutation
changes the proline normally present at residue 36 in
the TETR protein to leucine. This single mutation is
sufficient to confer a dominant ethylene insensitivity
phenotype on the wild-type plant. The transformation
of tomato and other plants with this modified ETR
nucleic acid is expected to confer the dominant
ethylene insensitivity phenotype on such transformed
plant cells.

Alternatively, the precursor nucleic acid can be one
wherein one or more of the nucleotides of a wild-type
ETR nucleic acid have already been modified. Thus, for
example, the Arabidopsis thaliana ETR nucleic acid can
be modified at codon 31 to form a modified nucleic acid
containing the substitution of that codon with a codon
encoding an amino acid other than alanine, e.g.,
valine. This modified ETR nucleic acid can also act as
a precursor nucleic acid to introduce a second
modification. For example, the codon encoding Ala-102
can be modified to encode the substitution of threonine
in which case the thus formed modified nucleic acid
encodes the substitution of two different amino acids
at residues 31 and 102.

Deletions within the ETR nucleic acid are also
contemplated. For example, an ETR nucleic acid can be
modified to delete that portion encoding the putative
transmembrane or intracellular domains. The thus
formed modified ETR nucleic acid when expressed within
a plant cell produces only an amino-terminal portion of
the ETR protein which is potentially capable of binding
ethylene, either directly or indirectly, to modulate
the effective level of ethylene in plant tissue.

In addition, the modified ETR nucleic acid can be
identified and isolated from a mutant plant having a
dominant or recessive phenotype characterized by an
altered response to ethylene. Such mutant plants can
be spontaneously arising or can be induced by well
known chemical or radiation mutagenesis techniques
followed by the determination of the ethylene response
in the progeny of such plants. Examples of such mutant
plants which occur spontaneously include the Never-ripe
mutant of tomato and the ethylene insensitive mutant of
carnation. Thus, modified ETR nucleic acids can be
obtained by recombinant modification of wild-type ETR
nucleic acids or by the identification and isolation of
modified ETR alleles from mutant plant species.

It is preferred that the modified ETR nucleic acid
encode the substitution, insertion and/or deletion of
one or more amino acid residues in the precursor ETR
protein. Upon expression of the modified nucleic acid
in host plant cells, the modified ETR protein thus
produced is capable of modulating at least the host
cell's response to ethylene. In connection with the
generation of such a phenotype, a number of codons have
been identified in the ETR nucleic acid from
Arabidopsis thaliana which, when modified and
reintroduced into a wild-type plant, result in a
decrease in the ethylene response by the transformed
plant. These codons encode amino acid residues Ala-31,
Ile-62, Cys-65 and Ala-102 in the ETR protein of
Arabidopsis thaliana. The ETR gene and each of these
particular modified amino acid residues were identified
by cloning the wild-type ETR gene from Arabidopsis
thaliana and chemically modified alleles from four
different varieties (etr1-1, etr1-2, etr1-3 and etr1-4)
of Arabidopsis thaliana (each of which exhibited a
dominant phenotype comprising insensitivity to
ethylene) and comparing the nucleotide and deduced
amino acid sequences. The invention, however, is not
limited to modified ETR nucleic acids from Arabidopsis
thaliana as described in the examples. Rather, the
invention includes other readily identifiable modified
ETR nucleic acids which modulate ethylene sensitivity.

The above four varieties exhibiting dominant ethylene
insensitivity were generated by chemical modification
of seedlings of Arabidopsis thaliana and identified by
observing plant development from such modified
seedlings with the addition of exogenous ethylene.
Using a similar approach either with or without the
addition of exogenous ethylene, the skilled artisan can
readily generate other variants of any selected plant
species which also have a modulated response to
ethylene. Then, using ETR probes based upon the
wild-type or modified ETR nucleic acid sequences disclosed
herein, other modified ETR nucleic acids can be
isolated by probing appropriate genomic or cDNA
libraries of the modified selected plant species. The
nucleotide and/or encoded amino acid sequence of such
newly generated modified ETR nucleic acids is then
preferably compared with the wild-type ETR nucleic acid
from the selected plant species to determine which
modifications, if any, in the ETR nucleic acid are
responsible for the observed phenotype. If the
wild-type sequence of the selected plant species is not
available, the wild-type or modified ETR sequences
disclosed herein for Arabidopsis thaliana or other ETR
sequences which have been identified can be used for
comparison. In this manner, other modifications to ETR
proteins can be identified which can confer the
ethylene insensitivity phenotype. Such modifications
include the identification of amino acids other than
those disclosed herein which can be substituted at
residues equivalent to Ala-31, Ile-62, Cys-65 and
Ala-102 in the Arabidopsis thaliana ETR protein and the
identification of other amino acid residues which can
be modified by substitution, insertion and/or deletion
of one or more amino acid residues to produce the
desired phenotype.
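
The comparison of a wild-type deduced amino acid sequence with that
of a mutant allele can be sketched as follows. This is only an
illustration under stated assumptions: the two sequences are invented
toy peptides of equal length, and list_substitutions is a
hypothetical helper name.

```python
def list_substitutions(wild_type, mutant):
    """List amino acid substitutions between two equal-length protein sequences.

    Each substitution is reported as <wild-type residue><position><mutant residue>,
    with positions counted from 1.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be the same length; align them first")
    return [
        f"{wt}{position}{mt}"
        for position, (wt, mt) in enumerate(zip(wild_type, mutant), start=1)
        if wt != mt
    ]


# Invented toy sequences, not actual ETR protein sequences.
wild_type = "MACIL"
mutant = "MVCIF"
print(list_substitutions(wild_type, mutant))
```

Sequences that differ by insertions or deletions would first need to
be aligned; the sketch deliberately rejects unequal lengths rather
than guessing an alignment.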

Alternatively, a cloned precursor ETR nucleic acid can
be systematically modified such that it encodes the
substitution, insertion and/or deletion of one or more
amino acid residues and tested to determine the effect
of such modification on a plant's ethylene response.
Such modifications are preferably made within that
portion of the ETR nucleic acid which encodes the
amino-terminal portion of the ETR protein. However,
modifications to the carboxy-terminal or putative
transmembrane domains to modulate signal transduction
are also contemplated (e.g., modifications of the
conserved histidine of the histidine kinase domain
which is the supposed site of autophosphorylation or
the conserved aspartate of the response regulator
domain which is the supposed site of phosphorylation).
One method which may be used for identifying particular
amino acid residues involved in the direct or indirect
interaction with ethylene is the sequential
substitution of the codons of an ETR nucleic acid with
codons encoding a scanning amino acid such as glycine
or alanine (See, e.g., PCT Publication WO90/04788
published May 3, 1990) followed by transformation of
each of the thus formed modified nucleic acids into a
plant to determine the effect of such sequential
substitution on the ethylene response. Other approaches
include random modifications or predetermined targeted
modifications of the cloned ETR nucleic acid (See, e.g.,
PCT Publication No. WO92/07090 published April 30,
1992) followed by transformation of plant cells and the
identification of progeny having an altered ethylene
response. The ETR nucleic acid from those plants
having the desired phenotype is isolated and sequenced
to confirm or identify the modification responsible for
the observed phenotype.
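
The sequential substitution with a scanning amino acid can be
sketched as the generation of one variant per codon. The sequence
below is an invented toy, the choice of GCT as the alanine codon is
an assumption made for illustration, and scanning_variants is a
hypothetical helper name.

```python
def scanning_variants(cds, scan_codon="GCT", first=2, last=None):
    """Return (residue, variant_cds) pairs, one codon replaced per variant.

    Residues are 1-based. Scanning starts at residue 2 by default so the
    start codon is left untouched. Codons already identical to scan_codon
    are skipped because the substitution would change nothing.
    """
    n_codons = len(cds) // 3
    last = n_codons if last is None else min(last, n_codons)
    variants = []
    for residue in range(first, last + 1):
        start = (residue - 1) * 3
        if cds[start:start + 3].upper() == scan_codon:
            continue
        variants.append((residue, cds[:start] + scan_codon + cds[start + 3:]))
    return variants


# Invented toy coding sequence (Met-Cys-Ala-Ile), not an ETR sequence.
for residue, variant in scanning_variants("ATGTGTGCTATC"):
    print(residue, variant)
```

Each variant would then be transformed into a plant and scored
separately, as the text describes; the sketch covers only the
bookkeeping of which codon was changed in which construct.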

Amino acid residues equivalent to those specifically
identified in an ETR protein which can be modified to
alter the ethylene response can also be readily
identified in ETR proteins from other plant species.
For example, equivalent amino acid residues to those
identified in the ETR protein from Arabidopsis thaliana
can be readily identified in other ETR proteins. An
amino acid residue in a precursor ETR protein is
equivalent to a particular residue in the ETR protein
of Arabidopsis thaliana if it is homologous in position
in either primary or tertiary structure to the
specified residue of the Arabidopsis ETR protein.

In order to establish homology by way of primary
structure, the primary amino acid sequence of a
precursor ETR protein is directly compared by alignment
with the primary sequence of the ETR protein from
Arabidopsis thaliana. Such alignment is preferably of
the amino-terminal domain and will take into account
the potential insertion or deletion of one or more
amino acid residues as between the two sequences so as
to maximize the amino acid sequence homology. A
comparison of a multiplicity of ETR protein sequences
with that of Arabidopsis thaliana provides for the
identification of conserved residues among such
sequences which conservation is preferably maintained
for further comparison of primary amino acid sequence.
Based on the alignment of such sequences, the skilled
artisan can readily identify amino acid residues in
other ETR proteins which are equivalent to Ala-31,
Ile-62, Cys-65, Ala-102 and other residues in Arabidopsis
thaliana ETR protein. Such equivalent residues are
selected for modifications analogous to those of other
modified ETR proteins which confer the desired ethylene
responsive phenotype. Such modified ETR proteins are
preferably made by modifying a precursor ETR nucleic
acid to encode the corresponding substitution,
insertion and/or deletion at the equivalent amino acid
residue.
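
Once two sequences have been aligned, reading off the equivalent
residue is a matter of counting past the gaps. The sketch below
assumes the alignment has already been computed by some other tool
and is supplied as two gapped strings of equal length using "-" for
gaps; the sequences are invented and equivalent_residue is a
hypothetical helper name.

```python
def equivalent_residue(aligned_ref, aligned_other, ref_residue):
    """Map a 1-based residue number in the reference to the other sequence.

    Returns (position, amino_acid) in the other sequence, or None when the
    reference residue is aligned against a gap.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    ref_pos = other_pos = 0
    for ref_char, other_char in zip(aligned_ref, aligned_other):
        if ref_char != "-":
            ref_pos += 1
        if other_char != "-":
            other_pos += 1
        if ref_char != "-" and ref_pos == ref_residue:
            return None if other_char == "-" else (other_pos, other_char)
    raise ValueError("reference residue number is out of range")


# Invented toy alignment; the reference plays the role of the Arabidopsis protein.
aligned_ref = "MAC-ILK"
aligned_other = "M-CGILR"
print(equivalent_residue(aligned_ref, aligned_other, 3))
```

In this toy case residue 3 of the reference corresponds to residue 2
of the other sequence because of the gap, which is the kind of offset
the alignment is meant to account for.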

In addition to homology at the primary sequence level,
equivalent residues can be identified based upon
homology at the level of tertiary structure. The
determination of equivalency at this level will
generally require three-dimensional crystal structures
for an ETR protein or modified ETR protein from
Arabidopsis (or crystal structure of another ETR
protein having defined equivalent residues) and the
crystal structure of a selected ETR protein.
Equivalent residues at the level of tertiary structure
are defined as those for which the atomic coordinates
of two or more of the main chain atoms of a particular
amino acid residue of the selected ETR protein, as
compared to the ETR protein from Arabidopsis, are
within 0.13 nm and preferably 0.10 nm after alignment.
Alignment is achieved after the best model has been
oriented and positioned to give the maximum overlap of
atomic coordinates of non-hydrogen protein atoms of the
ETR proteins in question.
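
The distance criterion can be sketched as a simple check on
main chain atoms. The sketch assumes that the two structures have
already been superimposed, that coordinates are given in nanometres,
and that the main chain atoms are named N, CA, C and O; the
coordinates shown are invented and is_equivalent is a hypothetical
helper name.

```python
import math


def is_equivalent(ref_atoms, other_atoms, cutoff_nm=0.13, min_atoms=2):
    """Apply the two-or-more main chain atoms within cutoff criterion.

    ref_atoms and other_atoms map atom names to (x, y, z) tuples in nm,
    taken from structures that are already superimposed.
    """
    close = 0
    for name, ref_xyz in ref_atoms.items():
        other_xyz = other_atoms.get(name)
        if other_xyz is not None and math.dist(ref_xyz, other_xyz) <= cutoff_nm:
            close += 1
    return close >= min_atoms


# Invented coordinates (nm) for one residue in each superimposed structure.
ref_residue = {"N": (0.0, 0.0, 0.0), "CA": (0.146, 0.0, 0.0),
               "C": (0.2, 0.12, 0.0), "O": (0.3, 0.2, 0.0)}
other_residue = {"N": (0.05, 0.0, 0.0), "CA": (0.146, 0.1, 0.0),
                 "C": (0.5, 0.5, 0.0), "O": (0.9, 0.0, 0.0)}
print(is_equivalent(ref_residue, other_residue))
```

The superposition step itself (orienting the best model for maximum
overlap of non-hydrogen atoms) is not shown and would be handled by
structural alignment software.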

ETR nucleic acids can be derived from any of the higher
plants which are responsive to ethylene. Particularly
suitable plants include tomato, banana, kiwi fruit,
avocado, melon, mango, papaya, apple, peach and other
climacteric fruit plants. Non-climacteric species from
which ETR nucleic acids can be isolated include
strawberry, raspberry, blackberry, blueberry, lettuce,
cabbage, cauliflower, onion, broccoli, Brussels sprout,
cotton, canola, grape, soybean and oil seed rape. In
addition, ETR nucleic acids can be isolated from
flowering plants within the Division Magnoliophyta
which comprise the angiosperms which include
dicotyledons (Class Magnoliopsida and Dicotyledoneae)
and monocotyledons (Class Liliopsida). Particularly
preferred Orders of angiosperm according to "Taxonomy
of Flowering Plants", by A.M. Johnson, The Century Co.,
NY, 1931 include Rosales, Cucurbitales, Rubiales,
Campanulatae, Contortae, Tubiflorae, Plantaginales,
Ericales, Primulales, Ebenales, Diapensiales,
Primulales, Plumbaginales, Opuntiales, Parietales,
Myrtiflorae, Umbelliflorae, Geraniales, Sapindales,
Rhamnales, Malvales, Pandales, Rhoeadales,
Sarraceniales, Ranales, Centrospermae, Santalales,
Euphorbiales, Capparales, Aristolochiales, Julianiales,
Juglandales, Fagales, Urticales, Myricales,
Polygonales, Batidales, Balanopsidales, Proteales,
Salicales, Leitneriales, Garryales, Verticillatae and
Piperales. Particularly preferred plants include lily,
carnation, chrysanthemum, petunia, rose, geranium,
violet, gladioli, orchid, lilac, crabapple, sweetgum,
maple, poinsettia, locust, ash and linden tree.

In addition to providing a source for ETR nucleic acids
which can be modified or isolated according to the
teachings herein, the foregoing plants can be used as
recipients of the modified nucleic acid to produce
chimeric or transgenic plants which exhibit an ethylene
resistance phenotype in one or more tissue types of the
transformed plant.

Once a modified ETR nucleic acid has been cloned, it is
used to construct vectors for transforming plant cells.
The construction of such vectors is facilitated by the
use of a shuttle vector which is capable of
manipulation and selection in both a plant and a
convenient cloning host such as a prokaryote. Such
shuttle vectors thus can include an antibiotic
resistance gene for selection in plant cells (e.g.,
kanamycin resistance) and an antibiotic resistance gene
for selection in a bacterial host (e.g., actinomycin
resistance). Such shuttle vectors also contain an
origin of replication appropriate for the prokaryotic
host used and preferably at least one unique
restriction site or a polylinker containing unique
restriction sites to facilitate vector construction.
Examples of such shuttle vectors include pMON530
(Rogers et al. (1988) Methods in Enzymology 153:253-277)
and pCGN1547 (McBride et al. (1990) Plant
Molecular Biology 14:269-276).



In the preferred embodiments, which comprise the best
mode for practicing the invention, a promoter is used
to drive expression of an ETR or a modified ETR nucleic
acid within at least a portion of the tissues of a
transformed plant. Expression of an ETR nucleic acid
is preferably in the antisense orientation to modulate
the ethylene response by reduction in translation of
the endogenous ETR RNA transcript. Expression of a
modified ETR nucleic acid results in the production of
a modified ETR protein which is capable of conferring
ethylene insensitivity. Such promoters may be obtained
from plants, plant pathogenic bacteria or plant
viruses. Constitutive promoters include the 35S and
19S promoters of cauliflower mosaic virus (CaMV35S and
CaMV19S), the full-length transcript promoter from the
Figwort mosaic virus (FMV35S) (See PCT Publication No.
WO92/12249 published July 23, 1992) and promoters
associated with Agrobacterium genes such as nopaline
synthase (NOS), mannopine synthase (MOS) or octopine
synthase (OCS). Other constitutive promoters include
the α-1 and β-1 tubulin promoters (Silflow et al.
(1987) Devel. Genet. 8:435-460), the histone promoters
(Chaubet (1987) Devel. Genet. 8:461-473) and the
promoters which regulate transcription of ETR nucleic
acids.

In some embodiments, tissue and/or temporal-specific
promoters can be used to control expression of ETR and
modified ETR nucleic acids. Examples of fruit specific
promoters include the E8, E4, E17 and J49 promoters
from tomato (Lincoln et al. (1988) Mol. Gen. Genet.
212:71-75) and the 2A11, Z130 and Z70 promoters from
tomato as described in U.S. Pat. Nos. 4,943,674,
5,175,095 and 5,177,307. In addition, preferential
expression in rapidly dividing tissue can be obtained
utilizing the plant EF-1α promoter as described in U.S.
Pat. No. 5,177,011. Examples of floral specific
promoters include the leafy promoter and promoters from
the apetala, pistillata and agamous genes. A promoter
system for targeting expression in the leaves of a
transformed plant is a chimeric promoter comprising the
CaMV35S promoter ligated to the portion of the
ssRUBISCO gene which represses the expression of
ssRUBISCO in the absence of light. In addition,
pollen-specific promoters can also be used. Such
promoters are well known to those skilled in the art
and are readily available. An example of such a
promoter is Zm13 (Hamilton et al. (1992) Plant Mol.
Biol. 18:211-218). This promoter was cloned from corn
(Monocot) but functions as a strong and pollen-specific
promoter when used in tobacco (Dicot).

Examples of inducible promoters which can be used for
conditional expression of ETR nucleic acids include
those from heat-shock protein genes such as the PHS1
heat-shock protein gene (Takahashi et al. (1989) Mol.
Gen. Genet. 219:365-372) and light-inducible promoters
including the three chlorophyll a/b light harvesting
protein promoters (Leutwiler et al. (1986) Nucl. Acids
Res. 24:4051-4064) and the pre-ferredoxin promoter
(Vorst et al. (1990) Plant Mol. Biol. 14:491-499).

In a further embodiment of the invention, the vector
used to transform plant cells is constructed to target
the insertion of the ETR nucleic acid into an
endogenous promoter within a plant cell. One type of
vector which can be used to target the integration of
a modified ETR nucleic acid to an endogenous promoter
comprises a positive-negative selection vector
analogous to that set forth by Mansour, et al. Nature
335:348-352 (1988) which describes the targeting of
exogenous DNA to a predetermined endogenous locus in
mammalian ES cells. Similar constructs utilizing
positive and negative selection markers functional in
plant cells can be readily designed based upon the
identification of the endogenous plant promoter and the
sequence surrounding it. When such an approach is
used, it is preferred that a replacement-type vector be
used to minimize the likelihood of reversion to the
wild-type genotype.

The vectors of the invention are designed such that the
promoter sequence contained in the vector or the
promoter sequence targeted in the plant cell genome is
operably linked to the ETR or modified ETR nucleic
acid. When the positive strand of
the ETR nucleic acid is used, the term "operably
linked" means that the promoter sequence is positioned
relative to the coding sequence of the ETR nucleic acid
such that RNA polymerase is capable of initiating
transcription of the ETR nucleic acid from the promoter
sequence. In such embodiments it is also preferred to
provide appropriate ribosome binding sites,
transcription initiation and termination sequences,
translation initiation and termination sequences and
polyadenylation sequences to produce a functional RNA
transcript which can be translated into ETR protein.

When an antisense orientation of the ETR nucleic acid
is used, all that is required is that the promoter be
operably linked to transcribe the ETR antisense strand.
Thus, in such embodiments, only transcription start and
termination sequences are needed to provide an RNA
transcript capable of hybridizing with the mRNA or
other RNA transcript from an endogenous ETR gene or
modified ETR nucleic acid contained within a
transformed plant cell. In addition to promoters,
other expression regulation sequences, such as
enhancers, can be added to the vector to facilitate the
expression of ETR nucleic acid in vivo.
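
The relationship between the sense strand and an antisense insert can
be illustrated with a reverse complement. This is a minimal sketch on
an invented toy sequence; reverse_complement is a hypothetical helper
name, and real constructs also involve the promoter and termination
sequences discussed above.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Invented toy sense-strand fragment, not an ETR sequence.
sense = "ATGGCTTGC"
antisense = reverse_complement(sense)
print(antisense)
```

Placing the fragment in the antisense orientation behind a promoter
means the transcript carries the reverse complement of the sense
sequence, which is what allows it to hybridize with the endogenous
mRNA.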

Once a vector is constructed, the transformation of
plants can be carried out in accordance with the
invention by essentially any of the various
transformation methods known to those skilled in the
art of plant molecular biology. Such methods are
generally described in Methods in Enzymology, Vol. 153
("Recombinant DNA Part D") 1987, Wu and Grossman,
Academic Press, eds. As used herein, the term
"transformation" means the alteration of the genotype
of a plant cell by the introduction of exogenous
nucleic acid. Particular methods for transformation of
plant cells include the direct microinjection of the
nucleic acid into a plant cell by use of micropipettes.
Alternatively, the nucleic acid can be transferred into
a plant cell by using polyethylene glycol (Paszkowski
et al. EMBO J. 3:2717-2722 (1984)). Other
transformation methods include electroporation of
protoplasts (Fromm, et al. Proc. Natl. Acad. Sci.
U.S.A. 82:5824 (1985)); infection with a plant specific
virus, e.g., cauliflower mosaic virus (Hohn et al.
"Molecular Biology of Plant Tumors", Academic Press,
New York (1982), pp. 549-560) or use of transformation
sequences from plant specific bacteria such as
Agrobacterium tumefaciens, e.g., a Ti plasmid
transmitted to a plant cell upon infection by
Agrobacterium tumefaciens (Horsch et al. Science
233:496-498 (1984); Fraley et al. Proc. Natl. Acad.
Sci. U.S.A. 80:4803 (1983)). Alternatively, plant
cells can be transformed by introduction of nucleic
acid contained within the matrix or on the surface of
small beads or particles by way of high velocity
ballistic penetration of the plant cell (Klein et al.
Nature 327:70-73 (1987)).

After the vector is introduced into a plant cell,
selection for successful transformation is typically
carried out prior to regeneration of a plant. Such
selection for transformation is not necessary, but
facilitates the selection of regenerated plants having
the desired phenotype by reducing wild-type background.
Such selection is conveniently based upon the
antibiotic resistance and/or herbicide resistance genes
which may be incorporated into the transformation
vector.

Practically all plants can be regenerated from cultured
cells or tissues. As used herein, the term
"regeneration" refers to growing a whole plant from a
plant cell, a group of plant cells or a plant part.
The methods for plant regeneration are well known to
those skilled in the art. For example, regeneration
from cultured protoplasts is described by Evans et al.
"Protoplasts Isolation and Culture", Handbook of Plant
Cell Cultures 1:124-176 (MacMillan Publishing Co., New
York 1983); M.R. Davey, "Recent Developments in the
Culture and Regeneration of Plant Protoplasts",
Protoplasts (1983) Lecture Proceedings, pp. 12-29
(Birkhauser, Basel 1983); and H. Binding "Regeneration
of Plants", Plant Protoplasts, pp. 21-73 (CRC Press,
Boca Raton 1985). When transformation is of an organ
part, regeneration can be from the plant callus,
explants, organs or parts. Such methods for
regeneration are also known to those skilled in the
art. See, e.g., Methods in Enzymology, supra; Methods
in Enzymology, Vol. 118; and Klee et al. Annual Review
of Plant Physiology 38:467-486.

A preferred method for transforming and regenerating
petunia with the vectors of the invention is described
by Horsch, R.B. et al. (1985) Science 227:1229-1231.
A preferred method for transforming cotton with the
vectors of the invention and regenerating plants
therefrom is described by Trolinder et al. (1987) Plant
Cell Reports 5:231-234.

Tomato plant cells are preferably transformed utilizing
Agrobacterium strains by the method as described in
McCormick et al., Plant Cell Reports 5:81-84 (1986).
In particular, cotyledons are obtained from 7-8 day old
seedlings. The seeds are surface sterilized for 20
minutes in 30% Clorox bleach and germinated in
Plantcons boxes on Davis germination media. Davis
germination media is comprised of 4.3 g/l MS salts, 20
g/l sucrose and 10 ml/l Nitsch vitamins, pH 5.8. The
Nitsch vitamin solution is comprised of 100 mg/l
myo-inositol, 5 mg/l nicotinic acid, 0.5 mg/l pyridoxine
HCl, 0.5 mg/l thiamine HCl, 0.05 mg/l folic acid, 0.05
mg/l biotin, 2 mg/l glycine. The seeds are allowed to
germinate for 7-8 days in the growth chamber at 25°C,
40% humidity under cool white lights with an intensity
of 80 einsteins m⁻² s⁻¹. The photoperiod is 16 hours of
light and 8 hours of dark.

Once germination occurs, the cotyledons are explanted
using a #15 feather blade by cutting away the apical
meristem and the hypocotyl to create a rectangular
explant. These cuts at the short ends of the
germinating cotyledon increase the surface area for
infection. The explants are bathed in sterile Davis
regeneration liquid to prevent desiccation. Davis
regeneration media is composed of 1X MS salts, 3%
sucrose, 1X Nitsch vitamins, 2.0 mg/l zeatin, pH 5.8.
This solution was autoclaved with 0.8% Noble Agar.

The cotyledons are pre-cultured on "feeder plates"
composed of media containing no antibiotics. The media
is composed of 4.3 g/l MS salts, 30 g/l sucrose, 0.1
g/l myo-inositol, 0.2 g/l KH2PO4, 1.45 ml/l of a 0.9
mg/ml solution of thiamine HCl, 0.2 ml of a 0.5 mg/ml
solution of kinetin and 0.1 ml of a 0.2 mg/ml solution
of 2,4-D. This solution is adjusted to pH 6.0 with
KOH. These plates are overlaid with 1.5-2.0 ml of
tobacco suspension cells (TXD's) and a sterile Whatman
filter soaked in 2C005K media. 2C005K media is
composed of 4.3 g/l Gibco MS salt mixture, 1 ml B5
vitamins (1000X stock), 30 g/l sucrose, 2 ml/l PCPA
from 2 mg/ml stock, and 10 µl/l kinetin from 0.5 mg/ml
stock. The cotyledons were cultured for 1 day in a
growth chamber at 25°C under cool white lights with a
light intensity of 40-50 einsteins m⁻² s⁻¹ with a
continuous light photoperiod.
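
The final concentrations implied by the stock additions in the
feeder plate medium follow from simple dilution arithmetic. The
sketch below assumes that each stated stock volume is added per
litre of finished medium (the text states this explicitly only for
thiamine HCl), and final_mg_per_l is a hypothetical helper name.

```python
def final_mg_per_l(stock_mg_per_ml, ml_added, final_volume_l=1.0):
    """Final concentration (mg/l) after adding ml_added of stock to the medium."""
    return stock_mg_per_ml * ml_added / final_volume_l


# (stock concentration in mg/ml, volume added in ml), per litre assumed.
additions = {
    "thiamine HCl": (0.9, 1.45),
    "kinetin": (0.5, 0.2),
    "2,4-D": (0.2, 0.1),
}
for name, (stock, volume) in additions.items():
    print(f"{name}: {final_mg_per_l(stock, volume):.3f} mg/l")
```

Under that per-litre assumption the additions work out to roughly
1.3 mg/l thiamine HCl, 0.1 mg/l kinetin and 0.02 mg/l 2,4-D.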

Cotyledons are then inoculated with a log phase
solution of Agrobacterium containing the modified or
wild type ETR nucleic acid. The concentration of the
Agrobacterium is approximately 5x10*^ cells/ml. The
cotyledons are allowed to soak in the bacterial
solution for six minutes and are then blotted to remove
excess solution on sterile Whatman filter disks and
subsequently returned to the original feeder plate
where they are allowed to co-culture for 2 days. After
the two days, cotyledons are transferred to selection
plates containing Davis regeneration media with 2 mg/l
zeatin riboside, 500 µg/ml carbenicillin, and 100 µg/ml
kanamycin. After 2-3 weeks, cotyledons with callus
and/or shoot formation are transferred to fresh Davis
regeneration plates containing carbenicillin and
kanamycin at the same levels. The experiment is scored
for transformants at this time. The callus tissue is
subcultured at regular 3 week intervals and any
abnormal structures are trimmed so that the developing
shoot buds continue to regenerate. Shoots develop
within 3-4 months.

Once shoots develop, they are excised cleanly from
callus tissue and planted on rooting selection plates.
These plates contain 0.5X MSG containing 50 µg/ml
kanamycin and 500 µg/ml carbenicillin. These shoots
form roots on the selection media within two weeks. If
no roots appear after 2 weeks, shoots are trimmed and
replanted on the selection media. Shoot cultures are
incubated in percivals at a temperature of 22°C.

Shoots with roots are then potted when roots are about
2 cm in length. The plants are hardened off in a
growth chamber at 21°C with a photoperiod of 18 hours
light and 6 hours dark for 2-3 weeks prior to transfer
to a greenhouse. In the greenhouse, the plants are
grown at a temperature of 26°C during the day and 21°C
during the night. The photoperiod is 13 hours light
and 11 hours dark and the plants are allowed to mature.

Once plants have been regenerated, one or more plants
are selected based upon a change in the ethylene
response phenotype. For example, when a modified ETR
nucleic acid is used with its native promoter,
selection can be based upon an alteration in any one
of the "triple responses" of seedlings from such
plants. Guzman et al. (1990) The Plant Cell 2:523.
Alternatively, or when constitutive promoters are used,
various other ethylene responses can be assayed and
compared to the wild type plant. Such other ethylene
responses include epinasty (which is observed primarily
in tomato), epsision, abscission, flower petal
senescence and fruit ripening. In addition to overt
changes in the ethylene response, the levels of various
enzymes can be determined following exposure to
ethylene to determine the response time for the typical
increase or decrease in the level of a particular
protein such as an enzyme. Examples of various
ethylene responses which can be used to determine
whether a particular plant has a decreased response to
ethylene are set forth in Chapter 7, The Mechanisms of
Ethylene Action in "Ethylene in Plant Biology" 2d Ed.
F.B. Abeles, P.W. Morgan and M.E. Saltveit, Jr., eds.,
San Diego, Academic Press, Inc. (1992). When a tissue
and/or temporal-specific promoter or inducible promoter
is used, a modulation in the
ethylene response is determined in the appropriate
tissue at the appropriate time and if necessary under
the appropriate conditions to activate/inactivate an
inducible promoter. In each case, the ethylene
response is preferably compared to the same ethylene
response from a wild-type plant.



The following are particularly preferred embodiments
for modulating the ethylene response in fruit.
However, such embodiments can be readily modified to
modulate the ethylene response in vegetative tissue and
flowers.

In one approach, a modified ETR nucleic acid operably 
linked to a constitutive promoter of moderate strength 
is used to reduce the ethylene response. This results 
in a lengthening of the time for fruit ripening. 

In an alternate embodiment, a modified ETR nucleic acid
operably linked to a regulatable (inducible) promoter
is used so that the condition that turns on the
expression of the modified ETR nucleic acid can be
maintained to prevent fruit ripening. The condition
that turns off the expression of the modified ETR
nucleic acid can then be maintained to obtain ripening.
For example, a heat-inducible promoter can be used
which is active in high (field) temperatures, but not
in low temperatures such as during refrigeration. A
further example utilizes an auxin or
gibberellin-induced promoter such that transformed plants can be
treated with commercial auxin analogs such as 2,4-D or
with commercial gibberellin analogs such as Pro-Gibb to
prevent early ripening.

Alternatively, a strong constitutive promoter can be
operably linked to a modified ETR nucleic acid to
prevent fruit ripening. So as to allow eventual fruit
ripening, the plant is also transformed with a
wild-type ETR nucleic acid operably linked to an inducible
promoter. Expression of the wild-type ETR nucleic acid
is increased by exposing the plant to the appropriate
condition to which the inducible promoter responds.
When the wild-type ETR nucleic acid expression is
increased, the effect of expression of the modified ETR
nucleic acid is reduced such that fruit ripening
occurs.

Particular constructs which are desirable for use in
transforming plants to confer ethylene insensitivity
include the CaMV35S promoter operably linked to any
other mutant Arabidopsis ETR genomic or cDNA clones
including the corresponding modification at residue 36
to convert proline to leucine. Such constructs are
expected to confer a dominant ethylene insensitivity
phenotype to cells and plants transformed with and
expressing such constructs.



In addition, a preferred construct includes operably
linking the FMV promoter to drive expression of the
tomato TETR cDNA which has been engineered to contain
a mutation analogous to any of those identified in the
ETR genes from Arabidopsis as well as the Nr mutation
found in the tomato ETR gene. Such constructs are
expected to confer a dominant ethylene insensitivity
phenotype to cells and plants which are transformed
with and express such constructs.

Other preferred constructs include the operable linking
of the FMV promoter to ETR antisense cDNAs including TETR
and ETR1. Such constructs are expected to confer a
dominant ethylene insensitivity phenotype to cells and
plants which are transformed with and express such
constructs.

The invention can be practiced in a wide variety of
plants to obtain useful phenotypes. For example, the
invention can be used to delay or prevent floral
senescence and abscission during growth or during
transport or storage as occurs in flower beds or cotton
crops (Hall, et al. (1957) Physiol. Plant. 10:306-317)
and in ornamental flowers (e.g., carnations, roses)
that are either cut (Halevy, et al. (1981) Hort. Rev.
3:59-143) or not cut. In addition, the invention can
be practiced to delay or prevent senescence and
abscission of leaves and fruits in cucumber (Jackson,
et al. (1972) Can. J. Bot. 50:1465-1471), legumes and
other crops (Heck, et al. (1962) Texas Agric. Expt.
Sta. Misc. Publ. MP 613:1-13) and ornamental plants
(e.g., holly wreaths) (Curtis et al. (1952) Proc. Am.
Soc. Hort. Sci. 560:104-108). Other uses include the
reduction or prevention of bitter-tasting phenolic
compounds (isocoumarins) which are induced by ethylene,
for example in sweet potatoes (Kitinoja (1978)
"Manipulation of Ethylene Responses in Horticulture",
Reid, ed., Acta. Hort. Vol 201, 377-42), carrots (Coxon
et al. (1973) Phytochemistry 22:1881-1885),
parsnip (Shattuck et al. (1988) Hort. Sci. 23:912) and
Brassica. Other uses include the prevention of
selective damage to reproductive tissues as occurs in
oats and canola (Reid et al. (1985) in "Ethylene in
Plant Development", Roberts, Tucker, eds. (London),
Butterworths, pp. 277-286), the loss of flavor,
firmness and/or texture as occurs in stored produce
such as apples and watermelons (Risse et al. (1982)
Hort. Sci. 17:946-948), russet spotting (a post-harvest
disorder) which is ethylene induced in crisphead
lettuce (Hyodo et al. (1978) Plant Physiol. 62:31-35),
to promote male flower production (Jaiswal et al.
(1985) Proc. Indian Acad. Sci. (Plant Sci.) 95:453-459)
and to increase plant size, e.g., by delaying the
formation of flowers in ornamental bromeliads (Mekers
et al. (1983) Acta Hortic. 137:211-222). Furthermore,
a decrease in ethylene response can be used to delay
disease development, such as by preventing lesions
and senescence in cucumbers infected with
Colletotrichum lagenarium, and to reduce diseases in
plants in which ethylene causes an increase in disease
development, e.g., in barley, citrus, Douglas fir
seedlings, grapefruit, plum, rose, carnation,
strawberry, tobacco, tomato, wheat, watermelon and
ornamental plants. In addition, the invention can be
used to reduce the effect of ethylene found in the
environment and indirectly the effect of various
environmental stresses which result in the biosynthesis
of ethylene in plant tissue. For example, ethylene
exists at biologically detrimental levels in localized
atmospheres due to fires, automobile exhaust and
industry. See, e.g., Chapter 8, Ethylene in the
Environment in "Ethylene in Plant Biology", supra. In
addition, the invention can be used to minimize the
effect of ethylene synthesized in response to
environmental stresses such as flooding, drought,
oxygen deficiency, wounding (including pressure and
bruising), chilling, pathogen invasion (by viruses,
bacteria, fungi, insects, nematodes and the like),
chemical exposure (e.g., ozone, salt and heavy metal
ions) and radiation.

The following is presented by way of example and is not
to be construed as a limitation on the scope of the
invention. Further, all references referred to herein
are expressly incorporated by reference.

EXAMPLE 1 

Cloning of the ETR1 Gene

etr1-1 plants were crossed with two lines carrying the
recessive visible markers ap1 and clv2 respectively.
The F1 progeny were allowed to self-pollinate.
Phenotypes were scored in the F2. The recombination
percentages (using the Kosambi mapping function (D.D.
Kosambi (1944) Ann. Eugen. 12:172)) were determined in
centimorgans. The ETR1 locus mapped to the lower
portion of chromosome 1 between the visible genetic
markers ap1 and clv2 (6.5 +/-1.0 cM from AP1 and 2.8
+/-1.1 cM from CLV2).
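
As a minimal sketch of the Kosambi mapping function cited
above, the following Python snippet converts a recombination
fraction r into a map distance in centimorgans using the
standard form d = 25 ln((1 + 2r)/(1 - 2r)). The function name
and the sample fraction are illustrative assumptions and are
not taken from the mapping data of this example.

```python
import math


def kosambi_cm(r):
    """Return the Kosambi map distance (cM) for recombination fraction r."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # Kosambi (1944): d = 1/4 * ln((1 + 2r) / (1 - 2r)) Morgans, i.e. x100 for cM
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical recombination fraction, for illustration only
example_fraction = 0.05
print(f"r = {example_fraction} -> {kosambi_cm(example_fraction):.2f} cM")
```

For small r the result is close to 100 x r, and the correction
grows as r approaches 0.5.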

etr1-1 was crossed to tester line W100 (ecotype
Landsberg) (Koornneef et al. (1987) Arabidopsis Inf.
Serv. 23:46) and the F1 plants were allowed to self-
pollinate. Linkage of RFLP markers to the ETR1 locus
was analyzed in 56 plants as described in Chang, et
al. (1988) Proc. Natl. Acad. Sci. U.S.A. 85:6856. Of
the RFLP markers that reside in this region of
chromosome 1, one marker, lbAt315, completely
cosegregated with the etr1-1 mutant phenotype out of
112 chromosomes. The lbAt315 clone was therefore used
as a probe to initiate a chromosome walk in the ETR1
gene region. Various genomic DNA cosmid libraries were
utilized. One library contained subclones of two yeast
artificial chromosomes (YACs EG4E4 and EG2G11 (Grill et
al. (1991) Mol. Gen. Genet. 226:484)) that hybridized
to lbAt315. To subclone the YACs, total DNA from yeast
cells harboring EG4E4 or EG2G11 was partially digested
with Sau3AI, and cloned into the BglII site of cosmid
vector pCIT30 (Ma et al. (1992) Gene 117:161).
Standard cloning and screening methods were used
(Sambrook et al., Molecular Cloning: A Laboratory Manual
(Cold Spring Harbor Laboratory, Cold Spring Harbor, NY,
1989)). A library from the etr1-1 mutant was similarly
constructed in pCIT30. The wild type library was
constructed previously (Yanofsky et al. (1990) Nature
345:35). By restriction analysis and sequential
hybridization to these libraries, overlapping cosmids
(a contig) were obtained that spanned a distance of
approximately 230 kb. See Fig. 8.

The ETR1 gene was localized to a subregion of
approximately 47 kb using fine structure RFLP mapping.
To create the fine structure map, meiotic recombinants
were isolated based on phenotype from the F2 self-
progeny of the above crosses between the etr1-1 mutant
(ecotype Columbia) and two lines (both ecotype
Landsberg) carrying ap1 and clv2. Recombinants were
identified in the F2 progeny as plants that were either
wild type at both loci or mutant at both loci. ETR1
was scored in dark grown seedlings (Bleecker et al.
(1988) Science 241:1086). Seventy-four (74)
recombinants between ETR1 and AP1 were obtained, and 25
recombinants between ETR1 and CLV2. The recombination
break points were mapped using DNA fragments from the
chromosome walk as RFLP probes. Given the number of
recombinants isolated, the calculated average distance
between break points was roughly 20 kb for each cross.
Over the 230 kb contig, the actual density of break
points found was consistent with the calculated density
on the CLV2 side (with 5 break points in approximately
120 kb). The nearest break points flanking the ETR1
gene defined a DNA segment of approximately 47 kb.

To search for transcripts derived from this 47 kb
region, cDNA libraries were screened using DNA
fragments. One cDNA clone was designated λC4 and was
detected with the 4.25 kb EcoRI fragment 1 shown in
Fig. 8. Because λC4 potentially represented the ETR1
gene, this clone was further characterized.

EXAMPLE 2

ETR Gene Characterization

The nucleotide sequences of the λC4 cDNA and the
corresponding genomic DNA (Figure 2) (SEQ ID NO:1) were
determined using Sequenase version 2.0 (United States
Biochemical Co., Cleveland, Ohio) and synthetic
oligonucleotide primers having a length of 17
nucleotides. The primer sequences were chosen from
existing ETR1 sequences in order to extend the sequence
until the entire sequence was determined. The initial
sequence was obtained using primers that annealed to
the cloning vector. Templates were double-stranded
plasmids. Both strands of the genomic DNA were
sequenced, including 225 bp upstream of the presumed
transcriptional start site, and 90 bp downstream of the
polyadenylation site. λC4 was sequenced on a single
strand.

λC4 was 1812 base pairs long, including a polyA tail of
18 bases. From the DNA sequences and RNA blots
(described below), it was determined that λC4 lacked
approximately 1000 base pairs of the 5' end.

To obtain longer cDNAs, first strand cDNA was
synthesized (RiboClone cDNA Synthesis System, Promega,
Madison, Wisconsin) from seedling polyA+ RNA using
sequence-specific primers internal to λC4. The cDNA
was then amplified by PCR (Saiki, R.K. et al. (1985)
Science 230:1350) using various pairs of primers:
3' PCR primers were chosen to anneal to different
exons as deduced from the cDNA and genomic DNA
sequences, and 5' PCR primers were chosen to anneal to
various 5' portions of genomic DNA sequences. Six
different primers at the 5' end were used. The
farthest upstream primer which amplified the cDNA was
primer Q (5' AGTAAGAACGAAGAAGAAGTG) (SEQ ID NO:26). An
overlapping primer, which was shifted twelve bases
downstream, also amplified the cDNA. The cDNA could
not be amplified using a 5' end primer that was 98 base
pairs farther upstream. Genomic DNA templates were
used for PCR controls. The longest cDNA was considered
to extend to the 5' end of primer Q. The amplified
cDNAs were sequenced directly with Sequenase Version
2.0 as follows: after concentrating the PCR reactions
by ethanol precipitation, the amplified products were
separated by electrophoresis in 0.8% LMP agarose gels.
The DNA fragments were excised, and a mixture of 10 µl
excised gel (melted at 70°C), 1 ml 10 mM primer and 1.2
ml 5% Nonidet P-40 was heated at 90°C for two minutes
to denature the DNA. The mixture was then cooled to
37°C prior to proceeding with sequencing reactions.

The longest cDNA, which was 2786 bases (not including
the polyA tail), was consistent with the estimated size
of 2800 bases from RNA blots, and was presumed to be
close to full length. A potential TATA box (5'
ATAATAATAA) lies 33 bp upstream of the 5' end in the
genomic sequence. Based on comparison of the cDNA and
the genomic DNA sequences, the gene has six introns,
one of which is in the 5' untranslated leader. The
exons contain a single open reading frame of 738 amino
acids. See Fig. 3.

The determination that this gene is, in fact, ETR1 was
established by comparing the nucleotide sequences of
the wild type allele and the four mutant alleles. For
each mutant allele, an EcoRI size-selected library was
constructed in the vector lambda ZAPII (Stratagene,
La Jolla, California). Clones of the 4.25 kb EcoRI
fragment were isolated by hybridization with the wild
type fragment. These clones were converted into
plasmids (pBluescript vector) by in vivo excision
according to the supplier (Stratagene) and sequenced.
Two independent clones were sequenced on a single
strand for each mutant allele. The 5' ends (535 bp not
contained on the 4.25 kb EcoRI fragment) were amplified
by PCR and directly sequenced as previously described.
Codon differences were as follows: Codon 65 TGT to TAT
in etr1-1 (Figs. 6A, B, C and D), Codon 102 GCG to ACG
in etr1-2 (Figs. 7A, B, C and D), Codon 31 GCG to GTG
in etr1-3 (Figs. 4A, B, C and D), Codon 62 ATC to TTC
in etr1-4 (Figs. 5A, B, C and D). All four mutations
are clustered in the amino-terminal region of the
deduced protein sequence.
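
The amino acid consequence of each codon difference listed
above can be checked against the standard genetic code. The
following Python sketch is offered only as an illustration;
the table construction and the helper name describe() are
assumptions introduced here and are not part of the described
method.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Codon differences as stated in the text: allele -> (codon number, wild type, mutant)
MUTATIONS = {
    "etr1-1": (65, "TGT", "TAT"),
    "etr1-2": (102, "GCG", "ACG"),
    "etr1-3": (31, "GCG", "GTG"),
    "etr1-4": (62, "ATC", "TTC"),
}


def describe(allele):
    """Return the substitution in one-letter notation, e.g. 'C65Y'."""
    position, wild_type, mutant = MUTATIONS[allele]
    return f"{CODON_TABLE[wild_type]}{position}{CODON_TABLE[mutant]}"


for name in MUTATIONS:
    print(name, describe(name))
```

Each of the four changes is a single-base substitution that
replaces one amino acid with another, consistent with the
isoleucine to phenylalanine change described for codon 62.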

The ETR1 message was examined in standard RNA
electrophoresis (formaldehyde) gel blots. The 2.8 kb
ETR1 transcript was present in all plant parts examined
- leaves, roots, stems, flowers and seedlings (data not
shown). In addition, no differences were observed
between ETR1 transcripts of the wild type and the
mutant alleles (data not shown). Treatment with
ethylene did not detectably alter the amount of ETR1
mRNA in dark-grown wild type seedlings (data not
shown).

When the ETR1 gene was hybridized to Arabidopsis
genomic DNA blots at normal stringency (i.e., overnight
in 5X SSPE (0.9 M NaCl, 50 mM NaH2PO4, 40 mM NaOH, 4.5 mM
EDTA, pH 7.4) at 65°C, with the most stringent wash in
0.1X SSPE at 65°C for 30 minutes), only the expected
fragments of the ETR1 locus were observed (data not
shown). At reduced stringency (i.e., hybridization in
5X SSPE at 50°C and washes in 5X SSPE at 50°C), however,
numerous fragments were detected, which suggests that
a family of similar genes exists in Arabidopsis.

The predicted amino terminal sequence of ETR1 (residues
1-316) has no similarity to sequences in the GenBank
database (version 77.0). The carboxy-terminal portion,
however, is highly similar to the conserved domains of
both the sensor and the response regulator of the
prokaryotic two-component system of signal
transduction. In bacteria, the histidine protein
kinase domain of the sensor is characterized by five
sequence motifs arranged in a specific order with
loosely conserved spacing (Parkinson (1992) Annu. Rev.
Genet. 26:71). The deduced ETR1 sequence contains all
five motifs with the same relative order and spacing
found in the bacterial proteins (Fig. 9A). The deduced
sequence is most similar to the sequences of
Escherichia coli BarA (Nagasawa et al. (1992) Mol.
Microbiol. 6:3011) and Pseudomonas syringae LemA
(Hrabak et al. (1992) J. Bact. 174:3011); over the
entire histidine kinase domain (the 241 amino acids
from residues 336 through 566), there are 43% and 41%
amino acid identities with BarA and LemA respectively,
and 72% and 71% similarities respectively. The
function of BarA is unknown, although it was cloned
based on its ability to complement a deletion in the E.
coli osmotic sensor protein, EnvZ (Nagasawa, supra.).
LemA is required for pathogenicity of P. syringae on
bean plants (Hrabak, supra.). Other bacterial proteins
with sequences highly similar to this putative ETR1
domain are: Xanthomonas campestris RpfC (35% identity)
which is possibly involved in host recognition for
pathogenicity in cruciferous plants (Tang et al. (1991)
Mol. Gen. Genet. 226:409), E. coli RcsC (34% identity)
which is involved in regulation of capsule synthesis
(Stout et al. (1990) J. Bacteriol. 172:659) and E. coli
ArcB (25% identity) which is responsible for repression
of anaerobic enzymes (Iuchi et al. (1990) Mol.
Microbiol. 4:715).

Adjacent to the putative histidine kinase domain, the
deduced ETR1 sequence exhibits structural
characteristics and conserved residues of bacterial
response regulators. Structural characteristics of
response regulators are based on the known three-
dimensional structure of CheY (the response regulator
for chemotaxis) in Salmonella typhimurium and E. coli,
which consists of five parallel β-strands surrounded by
five α-helices (Stock et al. (1989) Nature 337:745;
Volz et al. (1991) J. Biol. Chem. 266:15511).
Sequences of bacterial response regulators have been
aligned to this structure based on residues that are
compatible with the hydrophobic core of CheY (Stock
et al. (1989) Microbiological Rev. 53:450). The
deduced ETR1 sequence can be similarly aligned (data
not shown). At four specific positions, response
regulators contain highly conserved residues - three
aspartates and a lysine (Parkinson et al. (1992) Annu.
Rev. Genet. 26:71; Stock et al., supra.); the three
aspartates form an acidic pocket into which protrudes
the side chain of the conserved lysine (Stock et al.
(1989) Nature 337:745; Volz et al. (1991) J. Biol.
Chem. 266:15511) and the third aspartate is the
receiver of the phosphate from phosphohistidine (Stock
et al. (1989), supra.). Except for the conservative
substitution of glutamate for the second aspartate,
these conserved amino acids are found in the same
positions in the deduced ETR1 sequence (Fig. 9B). The
deduced sequence in this domain (a stretch of 121 amino
acids from residues 609 through 729 in ETR1) is most
similar to the sequences of Bordetella parapertussis
BvgS (29% identity, 60% similarity) which controls
virulence-associated genes for pathogenicity in humans
(Aricò et al. (1991) Mol. Microbiol. 5:2481), E. coli
RcsC (29% identity, 64% similarity), P. syringae LemA
(26% identity, 57% similarity), X. campestris RpfC (25%
identity) and E. coli BarA (20% identity). All of the
bacterial proteins that are similar to ETR1 in sequence
are also structurally similar to ETR1 in that they
contain both the histidine kinase domain and the
response regulator domain. Although these features are
shared, the sensing functions are clearly diverged.

A potential membrane spanning domain (residues 295-313)
exists in the deduced ETR1 sequence based on hydropathy
analysis (Kyte et al. (1982) J. Mol. Biol. 157:105),
but it is unclear whether ETR1 is actually a
transmembrane protein since there is no clear signal
sequence. There are also no N-linked glycosylation
sites. While all of the bacterial proteins to which
the deduced ETR1 sequence is similar have two potential
membrane spanning domains flanking the amino terminal
domain, a few bacterial sensors (those which lack the
response regulator) do not.
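
Hydropathy analysis of the kind cited above averages
per-residue hydropathy values over a sliding window. The
sketch below assumes the published Kyte-Doolittle scale and
a 19-residue window (the length of the 295-313 segment);
the function name and the test peptide are hypothetical,
and the ETR1 sequence itself is not reproduced here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KD_SCALE[residue] for residue in sequence.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


# Hypothetical peptide: a 19-residue leucine stretch between charged flanks
peptide = "KKDE" + "L" * 19 + "RRDE"
profile = hydropathy_profile(peptide)
peak = max(profile)
print(f"peak window starts at residue {profile.index(peak) + 1}, mean {peak:.2f}")
```

A window whose mean value is strongly positive marks a
candidate membrane spanning segment; the threshold applied
is a matter of judgment and is not specified in the text.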

EXAMPLE 3

An etr1 Mutant Gene Confers
Ethylene Insensitivity to Wild Type Plants

Dominant ethylene insensitivity was conferred to wild
type Arabidopsis plants when the etr1-1 mutant gene was
stably introduced using Agrobacterium-mediated
transformation. The gene was carried on a 7.3 kb
genomic DNA fragment (fragments 1 and 2 in Fig. 8),
which included approximately 2.7 kb upstream of the
transcription initiation site, and approximately 1 kb
downstream of the polyadenylation site. It was cloned
into binary transformation vector pCGN1547 obtained
from Calgene, Inc., Davis, California. The vector also
carried a selectable marker for kanamycin resistance in
plants.

For the etr1-1 construct, the 4.25 kb EcoRI plasmid
clone containing the etr1-1 mutation was linearized by
partial EcoRI digestion and ligated with the 3.1 kb
EcoRI fragment which was agarose gel-purified from
cosmid clone thetaS (a subclone of YAC EG4E4 in the
walk). The resulting plasmid, containing the two EcoRI
fragments in the correct relative orientation, was
linearized at polylinker site Asp718, the ends were
filled in using Klenow enzyme, and BamHI linkers were
ligated to the blunt ends. Finally, the 7.3 kb insert
was removed from the plasmid at the polylinker site
BamHI, and ligated into the BamHI site of binary
transformation vector pCGN1547 (McBride, K.E. et al.
(1990) Plant Molecular Biology 14:269). For the
control construct, the wild type 7.3 kb fragment was
agarose gel-purified from EcoRI partially digested
cosmid thetaS, and subcloned into the EcoRI site of
pBluescript. The fragment was then removed using the
BamHI and KpnI sites of the polylinker, and ligated
into pCGN1547 that had been digested with BamHI and
KpnI. The mutant and wild type constructs were
transformed into Agrobacterium (Holsters et al. (1978)
Mol. Gen. Genet. 163:181) strain ASE (Monsanto) (Rogers
et al. (1988) Meth. Enzymol. 153:253). Arabidopsis
ecotype Nossen was transformed (Valvekens, D. et al.
(1988) Proc. Natl. Acad. Sci. U.S.A. 85:5536) using
root-tissue cultured in liquid rather than on solid
medium. Triploid plants having one mutant copy of the
ETR1 gene were obtained as the progeny of crosses
between the etr1-1 homozygote (diploid) and a
tetraploid wild type in ecotype Bensheim which has the
same triple response phenotype as ecotype Columbia.
Triploid wild type plants were similarly obtained by
crossing the diploid wild type to the tetraploid.
Ethylene sensitivity was assayed in dark-grown
seedlings treated with either ethylene (Bleecker et
al., supra.) or 0.5 mM ACC. For ACC treatment, plants
were germinated and grown on Murashige and Skoog basal
salt mixture (MS, Sigma), pH 5.7, 0.5 mM ACC (Sigma),
1% Bacto-agar (Difco). Kanamycin resistance was
measured by the extent of root elongation in one week
old seedlings grown on MS pH 5.7 ng/ml Kanamycin, 1%
Bacto-agar.

Ten kanamycin resistant plants were produced. Eight of
the ten exhibited ethylene insensitive self-progeny as
evaluated by the dark-grown seedling response to
ethylene. In each line, ethylene insensitivity
cosegregated with kanamycin resistance. As a control,
transformations were performed using the corresponding
7.3 kb genomic DNA fragment of the wild type from which
six kanamycin resistant plants were obtained. These
lines gave rise to only ethylene sensitive self-progeny
which did not appear to be different from the wild
type.

The etr1-1 transformants displayed different levels of
ethylene insensitivity. Thus, the wild type gene is
capable of attenuating the mutant phenotype and the
etr1-1 mutation is not fully dominant in the
transformed plants. Of the ten kanamycin resistant
lines, six gave completely dominant ethylene
insensitivity, indicating the presence of multiple
copies of the mutant gene. Two other lines displayed
partial dominance, and two lines appeared to be wild
type. Reduced ethylene insensitivity was presumably
due to low expression levels which can be caused by
position effects (e.g., DNA methylation) or possibly by
truncation of the transferred DNA.

EXAMPLE 4

Vector Constructs Containing Heterologous Promoter

This example describes the construction of a plant
transformation vector containing a heterologous
promoter to control expression of wild type and mutant
ETR1 nucleic acids.

The cauliflower mosaic virus 35S protein promoter
(Guilley et al. (1982) Cell 30:763-773; Odell, et al.
(1985) Nature 313:810-812 and Sanders et al. (1987)
Nucl. Acids Res. 15:1543-1558) and the 3' end of the
Nopaline synthase (NOS) gene were cloned into the
pCGN1547 vector to create pCGN18. The 35S promoter, on
a HindIII-BamHI fragment of approximately 1.6 kb, was
cloned into the unique HindIII-BamHI site of pCGN1547.
The 1 kb BamHI-KpnI NOS fragment was cloned into the
unique BamHI-KpnI site of pCGN1547.

The 4.25 kb EcoRI fragments of both the wild type and
mutant ETR1-1 alleles were independently cloned into the
unique BamHI site of the above pCGN18 vector using
BamHI linkers. This 4.25 kb EcoRI genomic fragment
contains the entire coding sequence including five
introns and approximately 1 kb genomic DNA downstream
of the polyadenylation site. It does not contain the
ETR1 promoter which is on the 3.1 kb EcoRI fragment 2 in
Fig. 8.

These vectors were used to transform root explants as
described in Example 3. Kanamycin resistant plants
containing the mutant ETR1-1 gene were obtained and
demonstrated an ethylene insensitivity phenotype
similar to that found in Example 3. Control plants
transformed with the wild type ETR1 gene produced only
ethylene sensitive self-progeny.

EXAMPLE 5

Vector Construct Utilizing Antisense ETR1

Ethylene insensitivity was conferred to wild-type
Arabidopsis by expression of an ETR1 antisense nucleic
acid which was introduced using standard Agrobacterium
root transformation procedure. Valvekens et al. (1988)
Proc. Natl. Acad. Sci. U.S.A. 85:5536. The antisense
nucleic acid consisted of a 1.9 kb ETR1 cDNA fragment.
Expression of this fragment, which extended from the
MscI restriction site at nucleotide 220 to the first
SmaI site at nucleotide 2176 in Figs. 3A, 3B, 3C and 3D,
was driven in the reverse orientation by the CaMV 35S
promoter. To construct the antisense nucleic acid,
BamHI linkers were ligated to the ends of the 1.9 kb
MscI-SmaI DNA fragment and the thus formed fragment was
ligated into the BamHI site of pCGN18 transformation
vector. Jack et al. (1994) Cell 76:703. The construct
was transformed into Agrobacterium strain ASE as
described above and then into Arabidopsis.
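
Because the fragment is placed in the reverse orientation
relative to the promoter, the resulting transcript is
complementary to the sense strand. The Python sketch below
illustrates this at the DNA level with a short hypothetical
sequence (not the ETR1 cDNA), and also checks that the
stated MscI and SmaI coordinates are consistent with a
fragment of roughly 1.9 kb. The function and variable names
are assumptions made for illustration.

```python
# Translation table for complementing DNA bases (upper and lower case)
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


# Hypothetical sense-strand fragment, for illustration only
sense = "ATGGAAGTC"
antisense = reverse_complement(sense)
print(sense, "->", antisense)

# Coordinates stated in the text for the MscI and first SmaI sites
MSCI_SITE = 220
SMAI_SITE = 2176
fragment_length = SMAI_SITE - MSCI_SITE
print(f"approximate fragment length: {fragment_length} bp")
```

Applying the function twice returns the original sequence,
which is a quick sanity check on any complementing routine.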

Seedlings derived from this transformation experiment
were tested for sensitivity to ethylene as previously
described. Seedlings containing the antisense
construct were ethylene insensitive.

EXAMPLE 6 

Identification of QITR,
a Second ETR Nucleic Acid in Arabidopsis



Genomic DNA from Arabidopsis thaliana was partially
digested with Sau3A and cloned into a λGEM11 (half-site
arms) obtained from Promega, Madison, Wisconsin. The
genomic digest was partially end-filled prior to cloning
with λGEM11 and plated on media as suggested by the
manufacturer.

The thus cloned library was screened with a 32P-labeled
cDNA XbaI fragment extending from nucleotides 993-2308
as set forth in Figures 3B, 3C and 3D. Hybridization
conditions were 50°C and 5X SSPE. Washes were made at
50°C in 0.2X SSPE. Several positively hybridizing clones
were identified, replated and rescreened. Positively
hybridizing clones were digested with SacI (which
cleaves within the arms of the cloning phage and within
the insert). The multiple fragments obtained therefrom
were subcloned into bacterial plasmids for sequencing.
The genomic DNA sequence (SEQ ID NO.:45) together with
the deduced amino acid sequence (SEQ ID NOs.:46 and 48)
is set forth in Figure 12. This ETR nucleic acid and
amino acid sequence is referred to as the QITR nucleic
or amino acid sequence respectively. The QITR cDNA
sequence (SEQ ID NO.:47) and the QITR amino acid
sequence (SEQ ID NOs:46 and 48) are shown in Figure 13.

By comparison to the ETR1 Arabidopsis nucleic acid and
amino acid sequence (see Figures 2 and 3), the QITR
protein appears to contain an amino terminal portion
having a relatively high level of homology to the amino
terminal portion of the ETR1 protein and a histidine
kinase portion with a moderate level of homology to the
same sequence in ETR1. The response regulatory region
found in ETR1 is not present in the QITR protein. The
overall nucleic acid homology is approximately 69%.
With regard to the amino terminal portion (i.e.,
between residues 1 through 316) the homology is
approximately 71% identical in terms of amino acid
sequence and 72% identical in terms of nucleic acid
sequence.

EXAMPLE 7

Modification of QITR Nucleic Acid
to Confer Ethylene Insensitivity

An amino acid substitution was made in a 5 kb QITR
genomic clone which was analogous to that for the ETR1-
4 mutation, namely the substitution of the isoleucine
at position 62 with phenylalanine. Compare Figure 3A
with Figure 5A at residue 62. As further indicated at
Figures 12 and 13, residue 62 in the QITR protein is
also isoleucine as in the ETR1 protein.

The amino acid substitution was made to the QITR
nucleic acid using oligonucleotide-directed in vitro
mutagenesis. Kunkel et al. (1987) Methods in
Enzymology 154:367-382. A Muta-gene kit from Bio-Rad
Laboratories, Hercules, California, was used in
connection with this particular mutation. The sequence
of the oligonucleotide used was 5' GGA GCC TTT TTC ATT
CTC. Replacement of nucleotide A with T in the codon
ATC changed the amino acid Ile at residue 62 to Phe in
the deduced protein sequence.

The QITR nucleic acid spanning approximately 5 kb from
the first HindIII site to the second KpnI site
contained approximately 2.4 kb of nucleotides upstream
from the start codon. This 5 kb fragment was ligated
into the pCGN1547 transformation vector (supra.). This
construct was then transformed into Agrobacterium
strain ASE as described supra and then into
Arabidopsis.

Seedlings derived from this transformation experiment
were tested for sensitivity to ethylene as previously
described. Seedlings containing the QITR nucleic acid
containing the modification at residue 62 were ethylene
insensitive.

EXAMPLE 8

Identification of Arabidopsis ETR Nucleic Acid Q8

The ETR nucleic acid Q8 (SEQ ID NOs:41 and 43) was
identified by direct sequence comparison with the ETR1
nucleic acid from Arabidopsis. The Arabidopsis Q8
nucleic acid was identified in connection with a
chromosome walk on chromosome 3 of Arabidopsis
thaliana.

Briefly, overlapping YAC clones were generated which
were thereafter subcloned into plasmids. The genomic
inserts in such plasmids were extricated by digesting
with restriction endonuclease and hybridized to a cDNA
library from Arabidopsis floral tissue.

Positively hybridizing inserts were sequenced to
produce the overall genomic sequence (SEQ ID NO.:41)
together with the deduced amino acid sequence (SEQ ID
NOs:42 and 44) as set forth in Figure 14. The cDNA
sequence (SEQ ID NO:43) and deduced amino acid sequence
(SEQ ID NOs:42 and 44) are set forth in Figure 15.

The overall nucleic acid homology as between the Q8
nucleic acid and the ETR1 nucleic acid is approximately
69%. With regard to the amino terminal portion
extending from residues 1 through 316, the overall
amino sequence homology is approximately 72% whereas
the nucleic acid encoding this sequence has a
sequence homology of approximately
71% as between the Q8 and ETR1 nucleic acids.



EXAMPLE 9 



Isolation of the TETR cDNA 

A 32P-labeled hybridization probe was prepared by
random-primer labeling of a 1.3 kb PCR fragment
generated by PCR amplification of the Arabidopsis ETR1
gene with the PCR primers "5'BamHI"
(GCGGGATCCATAGTGTAAAAAATTGATAATGG) and "3'BamHIB"
(CCGGATCCGTTGAAGACTTCCATCTTCTAACC).
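
Both primer names refer to BamHI, and both sequences carry
the BamHI recognition sequence GGATCC near the 5' end,
presumably to facilitate cloning of the amplified product.
The following Python sketch simply locates that site in
each primer; the dictionary layout and function name are
assumptions made for illustration.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# Primer sequences as given in the text, written 5' to 3'
PRIMERS = {
    "5'BamHI": "GCGGGATCCATAGTGTAAAAAATTGATAATGG",
    "3'BamHIB": "CCGGATCCGTTGAAGACTTCCATCTTCTAACC",
}


def find_site(primer, site=BAMHI_SITE):
    """Return the 0-based offset of the first site occurrence, or -1 if absent."""
    return primer.upper().find(site)


for name, sequence in PRIMERS.items():
    offset = find_site(sequence)
    # Bases after the site are the part expected to anneal to the template
    annealing_part = sequence[offset + len(BAMHI_SITE):]
    print(f"{name}: site at offset {offset}, downstream bases {annealing_part}")
```

The few bases preceding each site are the kind of short 5'
extension commonly added so that the enzyme can cut close
to the end of a PCR product; that rationale is an inference
and is not stated in the text.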

This probe was used to screen a cDNA library of red
tomato fruit mRNA cloned in the EcoRI site of lambda
ZAP II vector from Stratagene, La Jolla, CA. Twenty
(20) positive primary plaques were identified that
hybridized to this probe (2X SSC at 65°C wash
conditions) and secondary screens were performed on
these to obtain pure plaques. In vivo excision was
then performed with resultant recombinant phage and 19
independent plasmid clones were obtained.

Complementary DNAs, from plasmid clones containing the
largest fragments that hybridized to the ETR1 probe,
were sequenced and the nucleotide sequence and
predicted amino acid sequences of the longest tomato
cDNA (TETR14, also referred to as TETR) were compared
to the ETR1 and QITR sequences. The nucleotide
sequence of TETR14 predicted that the encoded peptide
was more similar to the QITR peptide than the ETR1
peptide. This conclusion was based on the fact that
the response regulatory domain (which is present in
ETR1) is absent in both TETR14 and QITR. The sequence
(or partial sequence) of several of the other cDNA
clones was determined and they were found to correspond
to the same gene.

EXAMPLE 10

Analysis of TETR14 Gene Expression

Northern analysis was performed with mRNA from
developing fruits of normal, or mutant tomato (Ripening
inhibitor (rin), Non-ripening (nor) or Never-ripe (Nr))
fruit. Stages of developing fruits used were mature
green, breaker, breaker plus 7 days, and mature green
fruit treated with ethylene. Messenger RNA that
hybridized to the TETR14 gene probe was not present at
the mature green stage, but was present in breaker,
breaker plus 7 days, and ethylene treated mature green
fruit. Thus, it was concluded that accumulation of the
TETR14 mRNA was regulated by ethylene. Accumulation of
the TETR14 mRNA was attenuated in all three ripening
mutants, further supporting the finding that mRNA
accumulation is ethylene regulated.



EXAMPLE 11 

Analysis of the TETR14 Gene
from Pearson and Never-ripe DNA

PCR primers were obtained that would specifically
amplify the N-terminal region of the TETR14 gene. The
amplified portion was between Met1 and Ile214 in Figs.
16A and 16B. The primers were

(CCGGATCCATGGAATCCTGTGATTGCATTG)

and TETR4A (GATAATAGGAAGATTAATTGGC). PCR conditions
(Perkin-Elmer Cetus): 1 µg of tomato genomic DNA, 40
picomole of each primer, 1 min 94°C, 2 min 45°C, 2 min
72°C, 35 cycles. PCR products obtained with these
primers, resulting from two independent amplification
reactions of Pearson and Nr DNA, were agarose gel
purified and either subcloned into the T/A vector
(Invitrogen) or digested with BamHI and XhoI and
subcloned into Bluescript KS- that had been linearized
with BamHI and SalI. Single stranded template DNA was
prepared from the resultant plasmids and sequenced.
The sequence of the PCR products from the Pearson DNA
was identical to the sequence of the TETR14 clone.
Sequence analysis revealed that the PCR fragments
resulting from PCR of the Nr DNA (TETR14-Nr) were not
identical to those obtained from the Pearson DNA. The
cytosine nucleotide at position 395 of the TETR14 gene
is a thymine in the gene amplified from the Nr DNA.
This nucleotide substitution in TETR14-Nr changes the
proline at amino acid position 36 of the predicted
peptide to a leucine. See Fig. 22 and Seq. ID Nos. 49
and 50 for the overall nucleic acid and amino acid
sequence respectively. This Pro-36 of the TETR14
corresponds to the Pro-36 of the ETR1 peptide and to
the Pro-36 of the QITR peptide. This result indicates
that a mutation in the tomato TETR14 gene confers
dominant ethylene insensitivity. Thus, it is
possible to predict that other changes in the TETR14
gene and other tomato ETR1 homologues will result in
ethylene insensitivity in tomato.
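
The proline-to-leucine change described above follows from the standard genetic code: proline is encoded by CCN and leucine by CTN, so a cytosine-to-thymine substitution at the second position of a proline codon yields leucine. The sketch below illustrates this for all four proline codons. The actual codon at position 36 of TETR14 is not reproduced in this Example, so the codons used here are generic, and the names `CODON_TABLE` and `substitute_c_to_t` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

PROLINE_CODONS = ("CCT", "CCC", "CCA", "CCG")


def substitute_c_to_t(codon: str, position: int) -> str:
    """Return the codon with the C at `position` (0-based) replaced by T."""
    if codon[position] != "C":
        raise ValueError(f"no cytosine at position {position} of {codon}")
    return codon[:position] + "T" + codon[position + 1:]


if __name__ == "__main__":
    for codon in PROLINE_CODONS:
        mutant = substitute_c_to_t(codon, 1)  # second codon position
        print(codon, CODON_TABLE[codon], "->", mutant, CODON_TABLE[mutant])
```

A C-to-T change at the first position of a proline codon would instead give serine (TCN), and a change at the third position is silent, so the second position is the only single C-to-T substitution consistent with the reported Pro-to-Leu change.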



Having described the preferred embodiments of the
invention, it will appear to those of ordinary skill in
the art that various modifications may be made to the
disclosed embodiments, and that such modifications are
intended to be within the scope of the invention.

All references are expressly incorporated herein by
reference.



SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Meyerowitz, Elliott M. 
Chang, Caren 
Bleecker, Anthony B. 

(ii) TITLE OF INVENTION: PLANTS HAVING MODIFIED RESPONSE TO
ETHYLENE

(iii) NUMBER OF SEQUENCES: 50 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Richard F. Trecartin 

(B) STREET: 3400 Embarcadero Center, Suite 3400 

(C) CITY: San Francisco 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP: 94111 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US94/ 

(B) FILING DATE: 01-JUL-1994

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/086,555 

(B) FILING DATE: 01-JUL-1993

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Trecartin, Richard F. 

(B) REGISTRATION NUMBER: 31,801 

(C) REFERENCE/DOCKET NUMBER: FP57515-1RFT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (415) 781-1989 

(B) TELEFAX: (415) 398-3249 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3879 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AAAGATAGTA rTTGrTOATA AATATCGGGA TATTTATCCT ATATTATCTG TATTTITCTr 60 
ACCAm-TTA CTCTATTCCT TTATCTACAT TACGTCATTA CACTATCATA AGATATTTGA 120
ATGAACAAAT TCATGCACCC ACCAGCTATA TTACCCTTTT TTATTAAAAA AAAACATCTG 180
ATAATAATAA CAAAAAAATT AGAGAAATGA CGTCGAAAAA AAAAGTAAGA ACGAAGAAGA 240 
AGTGTTAAAC CCAACCAATT TTGACTTGAA AAAAAGCTTC AACGCTCCCC TTTTCTCCTT 300 
CTCCGTCGCT CTCCGCCGCG TCCCAAATCC CCAATTCCTC CTCTTCTCCG ATCAATTCTT 360
CCCAAGTAAG CTTCTTCTTC CTCGATTCTC TCCTCAGATT GTTTCGTGAC TTCTTTATAT 420 
ATATTCTTCA CTTCCACAGT TTTCTTCTGT TGTTGTCGTC GATCTCAAAT CATAGAGATT 480 
GATTAACCTA ATTGGTCTTT ATCTAGTGTA ATGCATCGTT ATTAGGAACT TTAAATTAAG 540 
ATTTAATCGT TAATTTCATG ATTCGGATTC GAATTTTACT GTTCTCGAGA CTGAAATATG 600 
CAACCTATTT TTTCGTAATC GTTGTGATCG AATTCGATTC TTCAGAATTT ATAGCAATTT 660 
TGATGCTCAT GATCTGTCTA CGCTACGTTC TCGTCGTAAA TCGAAGTTGA TAATGCTATG 720 
TGTTTGTTAC ACAGGTGTGT GTATGTGTGA GAGAGGAACT ATAGTGTAAA AAATTCATAA 780 
TGGAAGTCTG CAATTGTATT GAACCGCAAT GGCCAGCGGA TGAATTGTTA ATGAAATACC 840 
AATACATCTC CGATTTCTTC ATTGCGATTG CGTATTTTTC GATTCCTCTT GAGTTGATTT 900 
ACTTTGTGAA GAAATCAGCC GTGTTTCCGT ATAGATGGGT ACTTGTTCAG TTTGGTGCTT 960
TTATCGTTCT TTGTGGAGCA ACTCATCTTA TTAACTTATG GACTTTCACT ACGCATTCGA 1020 
GAACCGTGGC GCTTGTGATG ACTACCGCGA AGGTGTTAAC CGCTGTTGTC TCGTGTGCTA 1080 
CTGCGTTGAT GCTTGTTCAT ATTATTCCTG ATCTTTTGAG TGTTAAGACT CGGGAGCTTT 1140 
TCTTGAAAAA TAAAGCTGCT GAGCTCGATA GAGAAATGGG ATTGATTCGA ACTCAGGAAG 1200 
AAACCGGAAG GCATGTGAGA ATGTTGACTC ATGAGATTAG AAGCACTTTA GATAGACATA 1260 
CTATTTTAAA GACTACACTT GTTGAGCTTG GTAGGACATT AGCTTTGGAG GAGTGTGCAT 1320 
TGTGGATGCC TACTAGAACT GGGTTAGAGC TACAGCTTTC TTATACACTT CGTCATCAAC 1380 
ATCCCGTGGA GTATACGGTT CCTATTCAAT TACCGGTGAT TAACCAAGTG TTTGGTACTA 1440 
GTAGGGCTGT AAAAATATCT CCTAATTCTC CTGTGGCTAG GTTGAGACCT GTTTCTGGGA 1500 
AATATATGCT AGGGGAGGTG GTCGCTGTGA GGGTTCCGCT TCTCCACCTT TCTAATTTTC 1560 
AGATTAATGA CTGGCCTGAG CTTTCAACAA AGAGATATGC TTTGATGGTT TTGATGCTTC 1620 
CTTCAGATAG TGCAAGGCAA TGGCATGTCC ATGAGTTGGA ACTCGTTGAA GTCGTCGCTG 1680 
ATCAGGTTTT ACATTGCTGA GAATTTCTCT TCTTTGCTAT GTTCATGATC TTGTCTATAA 1740 
CTTTTCTTCT CTTATTATAG GTGGCTGTAG CTCTCTCACA TGCTGCGATC CTAGAAGAGT 1800 
CGATGCGAGC TAGGGACCTT CTCATGGAGC AGAATGTTGC TCTTGATCTA GCTAGACGAG 1860 
AAGCAGAAAC AGCAATCCGT GCCCGCAATG ATTTCCTAGC GGTTATGAAC CATGAAATGC 1920 
GAACACCGAT GCATGCGATT ATTGCACTCT CTTCCTTACT CCAAGAAACG GAACTAACCC 1980
CTGAACAAAG ACTGATGGTG GAAACAATAC TTAAAAGTAG TAACCTTTTG GCAACTTTGA 2040
TGAATGATGT CTTAGATCTT TCAAGGTTAG AAGATGGAAG TCTTCAACTT GAACTTGGGA 2100
CATTCAATCT TCATACATTA TTTAGAGAGG TAACTTTTGA ACAGCTCTAT GTTTCATAAG 2160 
TTTATACTAT TTGTGTACTT GATTGTCATA TTGAATCTTG TTGCAGGTCC TCAATCTGAT 2220
AAAGCCTATA GCGGTTGTTA AGAAATTACC CATCACACTA AATCTTGCAC CAGATTTGCC 2280 
AGAATTTGTT GTTGGGGATG AGAAACGGCT AATGCAGATA ATATTAAATA TAGTTGGTAA 2340 
TGCTGTGAAA TTCTCCAAAC AAGGTAGTAT CTCCGTAACC GCTCTTGTCA CCAAGTCAGA 2400 
CACACGAGCT GCTGACTTTT TTGTCGTGCC AACTGGGAGT CATTTCTACT TGAGAGTGAA 2460 
GGTTATTATC TTGTATCTTG GGATCTTATA CCATAGCTGA AAGTATTTCT TAGGTCTTAA 2520
TTTTGATGAT TATTCAAATA TAGGTAAAAG ACTCTGGAGC AGGAATAAAT CCTCAAGACA 2580 
TTCCAAAGAT TTTCACTAAA TTTGCTCAAA CACAATCTTT AGCGACGAGA AGCTCGGGTG 2640
GTAGTGGGCT TGGCCTCGCC ATCTCCAAGA GGTTTGAGCC TTATTAAAAG ACGTTTTTTT 2700 
CCAACTTTTT CTTGTCTTCT GTGTTGTTAA AAGTTTACTC ATAAGCGTTT AATATGACAA 2760
GGTTTGTGAA TCTGATGGAG GGTAACATTT GGATTGAGAG CGATGGTCTT GGAAAAGGAT 2820
GCACGGCTAT CTTTGATGTT AAACTTGGGA TCTCAGAACG TTCAAACGAA TCTAAACAGT 2880 
CGGGCATACC GAAAGTTCCA GCCATTCCCC GACATTCAAA TTTCACTGGA CTTAAGGTTC 2940 
TTGTCATGGA TGAGAACGGG TTAGTATAAG CTTCTCACCT TTCTCTTTGC AAAATCTCTC 3000
GCCTTACTTC TTGCAAATGC AGATATTGGC GTTTAGAAAA AACGCAAATT TAATCTTATG 3060
AGAAACCGAT GATTATTTTG GTTGCAGGGT AAGTAGAATG GTGACGAAGG GACTTCTTGT 3120 
ACACCTTGGG TGCGAAGTGA CCACGGTGAG TTCAAACGAG GAGTGTCTCC GAGTTGTGTC 3180 
CCATGAGCAC AAAGTGGTCT TCATGGACGT GTGCATGCCC GGGGTCGAAA ACTACCAAAT 3240 
CGCTCTCCGT ATTCACGAGA AATTCACAAA ACAACGCCAC CAACGGCCAC TACTTGTGGC 3300 
ACTCAGTGGT AACACTGACA AATCCACAAA AGAGAAATGC ATGAGCTTTG GTCTAGACGG 3360 
TGTGTTGCTC AAACCCGTAT CACTAGACAA CATAAGAGAT GTTCTGTCTG ATCTTCTCGA 3420
GCCCCGGGTA CTGTACGAGG GCATGTAAAG GCGATGGATG CCCCATGCCC CAGAGGAGTA 3480
ATTCCGCTCC CGCCTTCTTC TCCCGTAAAA CATCGGAAGC TGATGTTCTC TGGTTTAATT 3540
GTGTACATAT CAGAGATTGT CGGAGCGTTT TGGATGATAT CTTAAAACAG AAAGGGAATA 3600 
ACAAAATAGA AACTCTAAAC CGGTATGTGT CCGTGGCGAT TTCGGTTATA GAGGAACAAG 3660 
ATGGTGGTGG TATAATCATA CCATTTCAGA TTACATGTTT GACTAATGTT GTATCCTTAT 3720 
ATATCTAGTT ACATTCTTAT AAGAATTTGG ATCGAGTTAT GGATGCTTGT TGCGTGCATG 3780
TATGACATTC ATCCAGTATT ATGGCGTCAG CTTTGCGCCG CTTAGTAGAA CAACAACAAT 3840
GGCGTTACTT AGTTTCTCAA TCAACCCGAT CTCCAAAAC 3879

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 188..2401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA AAGCTTCAAC 60 

GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC CAAATCCCCA ATTCCTCCTC 120 

TTCTCCGATC AATTCTTCCC AAGTGTGTGT ATGTGTGAGA GAGGAACTAT AGTGTAAAAA 180 

ATTCATA ATG GAA GTC TGC AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT 229
Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp
1 5 10

GAA TTG TTA ATG AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT 277
Glu Leu Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile
15 20 25 30

GCG TAT TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325
Ala Tyr Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser
35 40 45

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT TTT ATC 373
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile
50 55 60

GTT CTT TGT GGA GCA ACT CAT CTT ATT AAC TTA TGG ACT TTC ACT ACG 421
Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr
65 70 75

CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT ACC GCG AAG GTG TTA ACC 469 
His Ser Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr 
80 85 90 

GCT GTT GTC TCG TGT GCT ACT GCG TTG ATG CTT GTT CAT ATT ATT CCT 517 
Ala Val Val Ser Cys Ala Thr Ala Leu Met Leu Val His He He Pro 
95 100 105 110 

GAT CTT TTG AGT GTT AAG ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT 565 
Asp Leu Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala 
115 120 125 

GCT GAG CTC GAT AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC 613 
Ala Glu Leu Asp Arg Glu Met Gly Leu He Arg Thr Gin Glu Glu Thr 
130 135 140 

GGA AGG CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661 
Gly Arg His Val Arg Met Leu Thr His Glu He Arg Ser Thr Leu Asp 
145 150 155

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG ACA TTA 709
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu
160 165 170

GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA ACT GGG TTA GAG 757
Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu
175 180 185 190

CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA CAT CCC GTG GAG TAT ACG 805 
Leu Gin Leu Ser Tyr Thr Leu Arg His Gin His Pro Val Glu Tyr Thr 
195 200 205 



GTT CCT ATT CAA TTA CCG GTG ATT AAC CAA GTG TTT GGT ACT AGT AGG 853 
Val Pro lie Gin Leu Pro Val lie Asn Gin Val Phe Gly Thr Ser Arg 
210 215 220 

GCT GTA AAA ATA TCT CCT AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT 901 
Ala Val Lys lie Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val 
225 230 235 

TCT GGG AAA TAT ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT 949 
Ser Gly Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu 
240 245 250 

CTC CAC CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997 
Leu His Leu Ser Asn Phe Gin lie Asn Asp Trp Pro Glu Leu Ser Thr 
255 260 265 270 

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT GCA AGG 1045 
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg
275 280 285 

CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC GTC GCT GAT CAG 1093 
Gin Trp His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gin 
290 295 300 

GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC CTA GAA GAG TCG ATG CGA 1141 
Val Ala Val Ala Leu Ser His Ala Ala lie Leu Glu Glu Ser Met Arg 
305 310 315 

GCT AGG GAC CTT CTC ATG GAG CAG AAT GTT GCT CTT GAT CTA GCT AGA 1189 
Ala Arg Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg
320 325 330 

CGA GAA GCA GAA ACA GCA ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT 1237 
Arg Glu Ala Glu Thr Ala lie Arg Ala Arg Asn Asp Phe Leu Ala Val 
335 340 345 350 

ATG AAC CAT GAA ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT 1285
Met Asn His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser
355 360 365

TCC TTA CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333
Ser Leu Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val
370 375 380

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG AAT GAT 1381
Glu Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp
385 390 395

GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT CAA CTT GAA CTT 1429 
Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gin Leu Glu Leu 
400 405 410

GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA GAG GTC CTC AAT CTG ATA 1477
Gly Thr Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu Ile
415 420 425 430

AAG CCT ATA GCG GTT GTT AAG AAA TTA CCC ATC ACA CTA AAT CTT GCA 1525
Lys Pro Ile Ala Val Val Lys Lys Leu Pro Ile Thr Leu Asn Leu Ala
435 440 445

CCA GAT TTG CCA GAA TTT GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG 1573 
Pro Asp Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gin 
450 455 460 

ATA ATA TTA AAT ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT 1621 
He He Leu Asn He Val Gly Asn Ala Val Lys Phe Ser Lys Gin Glv 
465 470 475 

AGT ATC TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669 
Ser He Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala 
480 485 490 

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA GTG AAG 1717
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys
495 500 505 510 

GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC ATT CCA AAG ATT 1765 
Val Lys Asp Ser Gly Ala Gly He Asn Pro Gin Asp He Pro Lys He 
515 520 525 

TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA GCG ACG AGA AGC TCG GGT 1813 
Phe Thr Lys Phe Ala Gin Thr Gin Ser Leu Ala Thr Arg Ser Ser Glv 
530 535 540 

GGT AGT GGG CTT GGC CTC GCC ATC TCC AAG AGG TTT GTG AAT CTG ATG 1861 
Gly Ser Gly Leu Gly Leu Ala He Ser Lys Arg Phe Val Asn Leu Met 
545 ' 550 555 

GAG GGT AAC ATT TGG ATT GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG 1909 
Glu Gly Asn He Trp He Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr 
560 565 570 

GCT ATC TTT GAT GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT 1957 
Ala He Phe Asp Val Lys Leu Gly He Ser Glu Arg Ser Asn Glu Ser 
575 580 585 590 

AAA CAG TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005 
Lys Gin Ser Gly He Pro Lys Val Pro Ala He Pro Arg His Ser Asn 
595 600 605 

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA AGT AGA 2053 
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg 
610 615 620 

ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC GAA GTG ACC ACG 2101
Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr
625 630 635

GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT GTG TCC CAT GAG CAC AAA 2149
Val Ser Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys
640 645 650

GTG GTC TTC ATG GAC GTG TGC ATG CCC GGG GTC GAA AAC TAC CAA ATC 2197
Val Val Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile
655 660 665 670

GCT CTC CGT ATT CAC GAG AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA 2245
Ala Leu Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro
675 680 685

CTA CTT GTG GCA CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA 2293 
Leu Leu Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys 
690 695 700 

TGC ATG AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341 
Cys Met Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu
705 710 715 

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG GTA CTG 2389 
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu
720 725 730 

TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA TGCCCCAGAG GAGTAATTCC 2441 

Tyr Glu Gly Met 

735 

GCTCCCGCCT TCTTCTCCCG TAAAACATCG GAAGCTGATG TTCTCTGGTT TAATTGTGTA 2501 
CATATCAGAG ATTGTCGGAG CGTTTTGGAT GATATCTTAA AACAGAAAGG GAATAACAAA 2561 
ATAGAAACTC TAAACCGGTA TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT 2621 
GGTGGTATAA TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC CTTATATATG 2681 
TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG CTTGTTGCGT GCATGTATGA 2741 
CATTGATGCA GTATTATGGC GTCAGCTTTG CGCCGCTTAG TAGAAC 2787 



(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 738 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu
1 5 10 15

Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Ala Tyr
20 25 30

Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser Ala Val
35 40 45

Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile Val Leu
50 55 60

Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr His Ser
65 70 75 80

Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr Ala Val
85 90 95

Val Ser Cys Ala Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu
100 105 110

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu
115 120 125

Leu Asp Arg Glu Met Gly Leu lie Arg Thr Gin Glu Glu Thr Gly Arg 
130 135 140 

His Val Arg Met Leu Thr His Glu lie Arg Ser Thr Leu Asp Arg His 
145 150 155 160 

Thr lie Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Ala Leu 
165 170 175 

Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu Leu Gin 
180 185 190 

Leu Ser Tyr Thr Leu Arg His Gin His Pro Val Glu Tyr Thr Val Pro 
195 200 205 

lie Gin Leu Pro Val lie Asn Gin Val Phe Gly Thr Ser Arg Ala Val 
210 215 220 

Lys lie Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly 
225 230 235 240 

Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His 
245 250 255 

Leu Ser Asn Phe Gin lie Asn Asp Trp Pro Glu Leu Ser Thr Lys Arg 
260 265 270 

Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg Gin Trp 
275 280 285 

His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gin Val Ala 
290 295 300 

Val Ala Leu Ser His Ala Ala lie Leu Glu Glu Ser Met Arg Ala Arg 
305 310 315 320 

Asp Leu Leu Met Glu Gin Asn Val Ala Leu Asp Leu Ala Arg Arg Glu 
325 330 335 

Ala Glu Thr Ala lie Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn 
340 345 350 

His Glu Met Arg Thr Pro Met His Ala lie lie Ala Leu Ser Ser Leu 
355 360 365 

Leu Gin Glu Thr Glu Leu Thr Pro Glu Gin Arg Leu Met Val Glu Thr 
370 375 380 

lie Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp Val Leu 
385 390 395 400 

Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gin Leu Glu Leu Gly Thr 
405 410 415 

Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu lie Lys Pro 
420 425 430 

lie Ala Val Val Lys Lys Leu Pro lie Thr Leu Asn Leu Ala Pro Asp 
435 440 445

Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gln Ile Ile
450 455 460 

Leu Asn He Val Gly Asn Ala Val Lys Phe Ser Lys Gin Gly Ser He 
465 470 475 480 

Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala Asp Phe 
485 490 495 

Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys Val Lys 
500 505 510 

Asp Ser Gly Ala Gly lie Asn Pro Gin Asp He Pro Lys He Phe Thr 
515 520 525 

Lys Phe Ala Gin Thr Gin Ser Leu Ala Thr Arg Ser Ser Gly Gly Ser 
530 535 540 

Gly Leu Gly Leu Ala He Ser Lys Arg Phe Val Asn Leu Met Glu Gly 
545 550 555 560 

Asn He Trp He Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala He 
565 570 575 

Phe Asp Val Lys Leu Gly He Ser Glu Arg Ser Asn Glu Ser Lys Gin 
580 585 590 

Ser Gly He Pro Lys Val Pro Ala He Pro Arg His Ser Asn Phe Thr 
595 600 605 

Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg Met Val 
610 615 620 

Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr Val Ser 
625 630 635 640 

Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys Val Val 
645 650 655 

Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gin He Ala Leu 
660 665 670 

Arg He His Glu Lys Phe Thr Lys Gin Arg His Gin Arg Pro Leu Leu 
675 680 685 

Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met 
690 695 700 

Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu Asp Asn 
705 710 715 720 

Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu Tyr Glu
725 730 735 

Gly Met 

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 188..2401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA AAGCTTCAAC 60 

GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC CAAATCCCCA ATTCCTCCTC 120 

TTCTCCGATC AATTCTTCCC AAGTGTGTGT ATGTGTGAGA GAGGAACTAT AGTGTAAAAA 180 

ATTCATA ATG GAA GTC TGC AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT 229
Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp
1 5 10

GAA TTG TTA ATG AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT 277
Glu Leu Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile
15 20 25 30

GCG TAT TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325 
Ala Tyr Phe Ser lie Pro Leu Glu Leu lie Tyr Phe Val Lys Lys Ser 
35 40 45 

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT TTT ATC 373 
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gin Phe Gly Ala Phe lie 
50 55 60 

GTT CTT TAT GGA GCA ACT CAT CTT ATT AAC TTA TGG ACT TTC ACT ACG 421 
Val Leu Tyr Gly Ala Thr His Leu lie Asn Leu Trp Thr Phe Thr Thr 
65 70 75 

CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT ACC GCG AAG GTG TTA ACC 469 
His Ser Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr 
80 85 90 

GCT GTT GTC TCG TGT GCT ACT GCG TTG ATG CTT GTT CAT ATT ATT CCT 517 
Ala Val Val Ser Cys Ala Thr Ala Leu Met Leu Val His lie lie Pro 
95 100 105 110 

GAT CTT TTG AGT GTT AAG ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT 565 
Asp Leu Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala 
115 120 125 

GCT GAG CTC GAT AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC 613 
Ala Glu Leu Asp Arg Glu Met Gly Leu lie Arg Thr Gin Glu Glu Thr 
130 135 140 

GGA AGG CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661 
Gly Arg His Val Arg Met Leu Thr His Glu lie Arg Ser Thr Leu Asp 
145 150 155 

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG ACA TTA 709 
Arg His Thr lie Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu 
160 165 170 

GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA ACT GGG TTA GAG 757 
Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu 
175 180 185 190

CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA CAT CCC GTG GAG TAT ACG 805
Leu Gin Leu Ser Tyr Thr Leu Arg His Gin His Pro Val Glu Tyr Thr 
195 200 205 

GTT CCT ATT CAA TTA CCG GTG ATT AAC CAA GTG TTT GGT ACT AGT AGG 853 
Val Pro lie Gin Leu Pro Val He Asn Gin Val Phe Gly Thr Ser Arg 
210 215 220 

GCT GTA AAA ATA TCT CCT AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT 901 
Ala Val Lys He Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val 
225 230 235 

TCT GGG AAA TAT ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT 949 
Ser Gly Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu 
240 245 250

CTC CAC CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997 
Leu His Leu Ser Asn Phe Gin lie Asn Asp Trp Pro Glu Leu Ser Thr 
255 260 265 270 

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT GCA AGG 1045 
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg
275 280 285

CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC GTC GCT GAT CAG 1093 
Gin Trp His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gin 
290 295 300 

GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC CTA GAA GAG TCG ATG CGA 1141 
Val Ala Val Ala Leu Ser His Ala Ala He Leu Glu Glu Ser Met Arg 
305 310 315 

GCT AGG GAC CTT CTC ATG GAG CAG AAT GTT GCT CTT GAT CTA GCT AGA 1189 
Ala Arg Asp Leu Leu Met Glu Gin Asn Val Ala Leu Asp Leu Ala Arg 
320 325 330 

CGA GAA GCA GAA ACA GCA ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT 1237 
Arg Glu Ala Glu Thr Ala He Arg Ala Arg Asn Asp Phe Leu Ala Val 
335 340 345 350

ATG AAC CAT GAA ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT 1285 
Met Asn His Glu Met Arg Thr Pro Met His Ala He He Ala Leu Ser 
355 360 365 

TCC TTA CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333 
Ser Leu Leu Gin Glu Thr Glu Leu Thr Pro Glu Gin Arg Leu Met Val 
370 375 380 

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG AAT GAT 1381 
Glu Thr He Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp 
385 390 395 

GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT CAA CTT GAA CTT 1429 
Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gin Leu Glu Leu 
400 405 410 

GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA GAG GTC CTC AAT CTG ATA 1477 
Gly Thr Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu Ile
415 420 425 430 

AAG CCT ATA GCG GTT GTT AAG AAA TTA CCC ATC ACA CTA AAT CTT GCA 1525 
Lys Pro Ile Ala Val Val Lys Lys Leu Pro Ile Thr Leu Asn Leu Ala
435 440 445



wo 95/01439 



PCTAJS94/07418 



67 



CCA GAT TTG CCA GAA TTT GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG 1573
Pro Asp Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gln
450 455 460 

ATA ATA TTA AAT ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT 1621 
lie lie Leu Asn lie Val Gly Asn Ala Val Lys Phe Ser Lys Gin Gly 
465 470 475 

AGT ATC TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669
Ser Ile Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala
480 485 490

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA GTG AAG 1717
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys
495 500 505 510

GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC ATT CCA AAG ATT 1765
Val Lys Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp Ile Pro Lys Ile
515 520 525

TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA GCG ACG AGA AGC TCG GGT 1813
Phe Thr Lys Phe Ala Gln Thr Gln Ser Leu Ala Thr Arg Ser Ser Gly
530 535 540 

GGT AGT GGG CTT GGC CTC GCC ATC TCC AAG AGG TTT GTG AAT CTG ATG 1861 
Gly Ser Gly Leu Gly Leu Ala lie Ser Lys Arg Phe Val Asn Leu Met 
545 550 555 

GAG GGT AAC ATT TGG ATT GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG 1909 
Glu Gly Asn lie Trp lie Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr 
560 565 570 

GCT ATC TTT GAT GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT 1957 
Ala lie Phe Asp Val Lys Leu Gly lie Ser Glu Arg Ser Asn Glu Ser 
575 580 585 590 

AAA CAG TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005 
Lys Gin Ser Gly lie Pro Lys Val Pro Ala lie Pro Arg His Ser Asn 
595 600 605 

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA AGT AGA 2053 
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg 
610 615 620 

ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC GAA GTG ACC ACG 2101 
Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr 
625 630 635 

GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT GTG TCC CAT GAG CAC AAA 2149 
Val Ser Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys 
640 645 650 

GTG GTC TTC ATG GAC GTG TGC ATG CCC GGG GTC GAA AAC TAC CAA ATC 2197 
Val Val Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gin lie 
655 660 665 670 

GCT CTC CGT ATT CAC GAG AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA 2245
Ala Leu Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro
675 680 685 

CTA CTT GTG GCA CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA 2293 
Leu Leu Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys 
690 695 700

TGC ATG AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341
Cys Met Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu
705 710 715

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG GTA CTG 2389
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu
720 725 730 

TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA TGCCCCAGAG GAGTAATTCC 2441 

Tyr Glu Gly Met 

735 

GCTCCCGCCT TCTTCTCCCG TAAAACATCG GAAGCTGATG TTCTCTGGTT TAATTGTGTA 2501 
CATATCAGAG ATTGTCGGAG CGTTTTGGAT GATATCTTAA AACAGAAAGG GAATAACAAA 2561 
ATAGAAACTC TAAACCGGTA TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT 2621 
GGTGGTATAA TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC CTTATATATG 2681
TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG CTTGTTGCGT GCATGTATGA 2741 
CATTGATGCA GTATTATGGC GTCAGCTTTG CGCCGCTTAG TAGAAC 2787 
(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 738 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 

Met Glu Val Cys Asn Cys He Glu Pro Gin Trp Pro Ala Asp Glu Leu 
1 5 10 15 

Leu Met Lys Tyr Gin Tyr He Ser Asp Phe Phe He Ala He Ala Tyr 
20 25 30 

Phe Ser He Pro Leu Glu Leu He Tyr Phe Val Lys Lys Ser Ala Val 
35 40 45 

Phe Pro Tyr Arg Trp Val Leu Val Gin Phe Gly Ala Phe He Val Leu 
50 55 60 

Tyr Gly Ala Thr His Leu He Asn Leu Trp Thr Phe Thr Thr His Ser 
65 70 75 80 

Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr Ala Val 
85 90 95 

Val Ser Cys Ala Thr Ala Leu Met Leu Val His He He Pro Asp Leu 
100 105 110 

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu 
115 120 125 

Leu Asp Arg Glu Met Gly Leu He Arg Thr Gin Glu Glu Thr Gly Arg 
130 135 140 

His Val Arg Met Leu Thr His Glu He Arg Ser Thr Leu Asp Arg His 
145 150 155 160

Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Ala Leu
165 170 175 

Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu Leu Gin 
180 185 190 

Leu Ser Tyr Thr Leu Arg His Gin His Pro Val Glu Tyr Thr Val Pro 
195 200 205 

He Gin Leu Pro Val He Asn Gin Val Phe Gly Thr Ser Arg Ala Val 
210 215 . 220 

Lys He Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly 
225 230 235 240 

Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His 
245 250 255 

Leu Ser Asn Phe Gin He Asn Asp Trp Pro Glu Leu Ser Thr Lys Arg 
260 265 270 

Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg Gin Trp 
275 280 285 

His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gin Val Ala 
290 295 300 

Val Ala Leu Ser His Ala Ala He Leu Glu Glu Ser Met Arg Ala Arg 
305 310 315 320 

Asp Leu Leu Met Glu Gin Asn Val Ala Leu Asp Leu Ala Arg Arg Glu 
325 330 335 

Ala Glu Thr Ala He Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn 
340 345 350 

His Glu Met Arg Thr Pro Met His Ala He He Ala Leu Ser Ser Leu 
355 360 365 

Leu Gin Glu Thr Glu Leu Thr Pro Glu Gin Arg Leu Met Val Glu Thr 
370 375 380 

He Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp Val Leu 
385 390 395 400 

Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gin Leu Glu Leu Gly Thr 
405 410 415 

Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu He Lys Pro 
420 425 430 

He Ala Val Val Lys Lys Leu Pro He Thr Leu Asn Leu Ala Pro Asp 
435 440 445 

Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gin He He 
450 455 460 

Leu Asn He Val Gly Asn Ala Val Lys Phe Ser Lys Gin Gly Ser He 
465 470 475 480 

Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala Asp Phe 
485 490 495

Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys Val Lys
500 505 510 

Asp Ser Gly Ala Gly He Asn Pro Gin Asp He Pro Lys He Phe Thr 
515 520 525 

Lys Phe Ala Gin Thr Gin Ser Leu Ala Thr Arg Ser Ser Gly Gly Ser 
530 535 540 

Gly Leu Gly Leu Ala He Ser Lys Arg Phe Val Asn Leu Met Glu Gly 
545 550 555 560 

Asn He Trp He Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala He 
565 570 575 

Phe Asp Val Lys Leu Gly He Ser Glu Arg Ser Asn Glu Ser Lys Gin 
580 585 590 

Ser Glv He Pro Lys Val Pro Ala He Pro Arg His Ser Asn Phe Thr 
595 600 605 

Glv Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg Met Val 
610 615 620 

Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr Val Ser 
625 630 635 640 

Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys Val Val 
645 650 655 

Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu
660 665 670

Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu
675 680 685 

Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met 
690 695 700 

Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu Asp Asn 
705 710 715 720 

Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu Tyr Glu
725 730 735 

Gly Met 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 188..2401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA AAGCTTCAAC 60 

GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC CAAATCCCCA ATTCCTCCTC 120 

TTCTCCGATC AATTCTTCCC AAGTGTGTGT ATGTGTGAGA GAGGAACTAT AGTGTAAAAA 180 

ATTCATA ATG GAA GTC TGC AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT 229
Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp
1 5 10

GAA TTG TTA ATG AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT 277
Glu Leu Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile
15 20 25 30

GCG TAT TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325
Ala Tyr Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser
35 40 45

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT TTT ATC 373
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile
50 55 60 

GTT CTT TGT GGA GCA ACT CAT CTT ATT AAC TTA TGG ACT TTC ACT ACG 421 
Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr
65 70 75

CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT ACC GCG AAG GTG TTA ACC 469
His Ser Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr
80 85 90

GCT GTT GTC TCG TGT GCT ACT ACG TTG ATG CTT GTT CAT ATT ATT CCT 517
Ala Val Val Ser Cys Ala Thr Thr Leu Met Leu Val His Ile Ile Pro
95 100 105 110 

GAT CTT TTG AGT GTT AAG ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT 565 
Asp Leu Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala 
115 120 125 

GCT GAG CTC GAT AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC 613 
Ala Glu Leu Asp Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr
130 135 140

GGA AGG CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661
Gly Arg His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp
145 150 155

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG ACA TTA 709
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu
160 165 170

GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA ACT GGG TTA GAG 757
Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu
175 180 185 190

CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA CAT CCC GTG GAG TAT ACG 805
Leu Gln Leu Ser Tyr Thr Leu Arg His Gln His Pro Val Glu Tyr Thr
195 200 205

GTT CCT ATT CAA TTA CCG GTG ATT AAC CAA GTG TTT GGT ACT AGT AGG 853
Val Pro Ile Gln Leu Pro Val Ile Asn Gln Val Phe Gly Thr Ser Arg
210 215 220

GCT GTA AAA ATA TCT CCT AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT 901
Ala Val Lys Ile Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val
225 230 235 

TCT GGG AAA TAT ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT 949 
Ser Gly Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu 
240 245 250 

CTC CAC CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997
Leu His Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr
255 260 265 270 

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT GCA AGG 1045 
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg 
275 280 285 

CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC GTC GCT GAT CAG 1093 
Gln Trp His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gln
290 295 300

GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC CTA GAA GAG TCG ATG CGA 1141
Val Ala Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg
305 310 315

GCT AGG GAC CTT CTC ATG GAG CAG AAT GTT GCT CTT GAT CTA GCT AGA 1189
Ala Arg Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg
320 325 330

CGA GAA GCA GAA ACA GCA ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT 1237
Arg Glu Ala Glu Thr Ala Ile Arg Ala Arg Asn Asp Phe Leu Ala Val
335 340 345 350

ATG AAC CAT GAA ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT 1285
Met Asn His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser
355 360 365

TCC TTA CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333
Ser Leu Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val
370 375 380

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG AAT GAT 1381
Glu Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp
385 390 395

GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT CAA CTT GAA CTT 1429
Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gln Leu Glu Leu
400 405 410

GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA GAG GTC CTC AAT CTG ATA 1477
Gly Thr Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu Ile
415 420 425 430

AAG CCT ATA GCG GTT GTT AAG AAA TTA CCC ATC ACA CTA AAT CTT GCA 1525
Lys Pro Ile Ala Val Val Lys Lys Leu Pro Ile Thr Leu Asn Leu Ala
435 440 445

CCA GAT TTG CCA GAA TTT GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG 1573
Pro Asp Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gln
450 455 460

ATA ATA TTA AAT ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT 1621
Ile Ile Leu Asn Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly
465 470 475 



WO 95/01439 PCT/US94/07418



AGT ATC TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669
Ser Ile Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala
480 485 490

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA GTG AAG 1717
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys
495 500 505 510

GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC ATT CCA AAG ATT 1765
Val Lys Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp Ile Pro Lys Ile
515 520 525

TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA GCG ACG AGA AGC TCG GGT 1813
Phe Thr Lys Phe Ala Gln Thr Gln Ser Leu Ala Thr Arg Ser Ser Gly
530 535 540

GGT AGT GGG CTT GGC CTC GCC ATC TCC AAG AGG TTT GTG AAT CTG ATG 1861
Gly Ser Gly Leu Gly Leu Ala Ile Ser Lys Arg Phe Val Asn Leu Met
545 550 555

GAG GGT AAC ATT TGG ATT GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG 1909
Glu Gly Asn Ile Trp Ile Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr
560 565 570

GCT ATC TTT GAT GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT 1957
Ala Ile Phe Asp Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser
575 580 585 590

AAA CAG TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005
Lys Gln Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn
595 600 605

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA AGT AGA 2053
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg
610 615 620

ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC GAA GTG ACC ACG 2101
Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr
625 630 635

GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT GTG TCC CAT GAG CAC AAA 2149
Val Ser Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys
640 645 650

GTG GTC TTC ATG GAC GTG TGC ATG CCC GGG GTC GAA AAC TAC CAA ATC 2197
Val Val Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile
655 660 665 670

GCT CTC CGT ATT CAC GAG AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA 2245
Ala Leu Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro
675 680 685

CTA CTT GTG GCA CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA 2293
Leu Leu Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys 
690 695 700 

TGC ATG AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341 
Cys Met Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu 
705 710 715 

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG GTA CTG 2389 
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu
720 725 730

TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA TGCCCCAGAG GAGTAATTCC 2441
Tyr Glu Gly Met
735

GCTCCCGCCT TCTTCTCCCG TAAAACATCG GAAGCTGATG TTCTCTGGTT TAATTGTGTA 2501
CATATCAGAG ATTGTCGGAG CGTTTTGGAT GATATCTTAA AACAGAAAGG GAATAACAAA 2561
ATAGAAACTC TAAACCGGTA TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT 2621
GGTGGTATAA TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC CTTATATATG 2681
TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG CTTGTTGCGT GCATGTATGA 2741
CATTGATGCA GTATTATGGC GTCAGCTTTG CGCCGCTTAG TAGAAC 2787

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 738 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu
1 5 10 15

Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Ala Tyr
20 25 30

Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser Ala Val
35 40 45

Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile Val Leu
50 55 60

Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr His Ser
65 70 75 80

Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr Ala Val
85 90 95

Val Ser Cys Ala Thr Thr Leu Met Leu Val His Ile Ile Pro Asp Leu
100 105 110 

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu 
115 120 125 

Leu Asp Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg
130 135 140

His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His
145 150 155 160

Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Ala Leu
165 170 175

Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu Leu Gln
180 185 190

Leu Ser Tyr Thr Leu Arg His Gln His Pro Val Glu Tyr Thr Val Pro
195 200 205

Ile Gln Leu Pro Val Ile Asn Gln Val Phe Gly Thr Ser Arg Ala Val
210 215 220

Lys Ile Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly
225 230 235 240 

Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His 
245 250 255 

Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr Lys Arg
260 265 270

Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg Gln Trp
275 280 285

His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gln Val Ala
290 295 300

Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg Ala Arg
305 310 315 320

Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg Arg Glu
325 330 335

Ala Glu Thr Ala Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn
340 345 350

His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser Ser Leu
355 360 365

Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val Glu Thr
370 375 380

Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp Val Leu
385 390 395 400

Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gln Leu Glu Leu Gly Thr
405 410 415

Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu Ile Lys Pro
420 425 430

Ile Ala Val Val Lys Lys Leu Pro Ile Thr Leu Asn Leu Ala Pro Asp
435 440 445

Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gln Ile Ile
450 455 460

Leu Asn Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly Ser Ile
465 470 475 480 

Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala Asp Phe 
485 490 495 

Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys Val Lys 
500 505 510 

Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp Ile Pro Lys Ile Phe Thr
515 520 525

Lys Phe Ala Gln Thr Gln Ser Leu Ala Thr Arg Ser Ser Gly Gly Ser
530 535 540

Gly Leu Gly Leu Ala Ile Ser Lys Arg Phe Val Asn Leu Met Glu Gly
545 550 555 560

Asn Ile Trp Ile Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala Ile
565 570 575

Phe Asp Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser Lys Gln
580 585 590

Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn Phe Thr
595 600 605 

Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg Met Val 
610 615 620 

Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr Val Ser 
625 630 635 640 

Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys Val Val 
645 650 655 

Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu
660 665 670

Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu
675 680 685 

Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met 
690 695 700 

Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu Asp Asn 
705 710 715 720 

Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu Tyr Glu
725 730 735 

Gly Met 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 188..2401



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA AAGCTTCAAC 60 
GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC CAAATCCCCA ATTCCTCCTC 120
TTCTCCGATC AATTCTTCCC AAGTGTGTGT ATGTGTGAGA GAGGAACTAT AGTGTAAAAA 180

ATTCATA ATG GAA GTC TGC AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT 229 
Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp
1 5 10

GAA TTG TTA ATG AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT 277
Glu Leu Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile
15 20 25 30

GTG TAT TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325
Val Tyr Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser
35 40 45

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT TTT ATC 373
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile
50 55 60

GTT CTT TGT GGA GCA ACT CAT CTT ATT AAC TTA TGG ACT TTC ACT ACG 421
Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr
65 70 75 

CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT ACC GCG AAG GTG TTA ACC 469 
His Ser Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr 
80 85 90 

GCT GTT GTC TCG TGT GCT ACT GCG TTG ATG CTT GTT CAT ATT ATT CCT 517 
Ala Val Val Ser Cys Ala Thr Ala Leu Met Leu Val His Ile Ile Pro
95 100 105 110

GAT CTT TTG AGT GTT AAG ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT 565
Asp Leu Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala
115 120 125

GCT GAG CTC GAT AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC 613
Ala Glu Leu Asp Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr
130 135 140

GGA AGG CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661
Gly Arg His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp
145 150 155

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG ACA TTA 709
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu
160 165 170

GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA ACT GGG TTA GAG 757
Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu
175 180 185 190

CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA CAT CCC GTG GAG TAT ACG 805
Leu Gln Leu Ser Tyr Thr Leu Arg His Gln His Pro Val Glu Tyr Thr
195 200 205

GTT CCT ATT CAA TTA CCG GTG ATT AAC CAA GTG TTT GGT ACT AGT AGG 853
Val Pro Ile Gln Leu Pro Val Ile Asn Gln Val Phe Gly Thr Ser Arg
210 215 220

GCT GTA AAA ATA TCT CCT AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT 901
Ala Val Lys Ile Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val
225 230 235

TCT GGG AAA TAT ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT 949
Ser Gly Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu
240 245 250

CTC CAC CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997
Leu His Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr
255 260 265 270

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT GCA AGG 1045
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg
275 280 285

CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC GTC GCT GAT CAG 1093
Gln Trp His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gln
290 295 300

GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC CTA GAA GAG TCG ATG CGA 1141
Val Ala Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg
305 310 315 

GCT AGG GAC CTT CTC ATG GAG CAG AAT GTT GCT CTT GAT CTA GCT AGA 1189
Ala Arg Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg
320 325 330

CGA GAA GCA GAA ACA GCA ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT 1237
Arg Glu Ala Glu Thr Ala Ile Arg Ala Arg Asn Asp Phe Leu Ala Val
335 340 345 350

ATG AAC CAT GAA ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT 1285
Met Asn His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser
355 360 365

TCC TTA CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333
Ser Leu Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val
370 375 380

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG AAT GAT 1381
Glu Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp
385 390 395

GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT CAA CTT GAA CTT 1429
Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gln Leu Glu Leu
400 405 410

GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA GAG GTC CTC AAT CTG ATA 1477
Gly Thr Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu Ile
415 420 425 430

AAG CCT ATA GCG GTT GTT AAG AAA TTA CCC ATC ACA CTA AAT CTT GCA 1525
Lys Pro Ile Ala Val Val Lys Lys Leu Pro Ile Thr Leu Asn Leu Ala
435 440 445

CCA GAT TTG CCA GAA TTT GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG 1573
Pro Asp Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gln
450 455 460

ATA ATA TTA AAT ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT 1621
Ile Ile Leu Asn Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly
465 470 475 

AGT ATC TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669 
Ser Ile Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala
480 485 490

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA GTG AAG 1717
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys
495 500 505 510

GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC ATT CCA AAG ATT 1765
Val Lys Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp Ile Pro Lys Ile
515 520 525

TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA GCG ACG AGA AGC TCG GGT 1813
Phe Thr Lys Phe Ala Gln Thr Gln Ser Leu Ala Thr Arg Ser Ser Gly
530 535 540

GGT AGT GGG CTT GGC CTC GCC ATC TCC AAG AGG TTT GTG AAT CTG ATG 1861
Gly Ser Gly Leu Gly Leu Ala Ile Ser Lys Arg Phe Val Asn Leu Met
545 550 555 

GAG GGT AAC ATT TGG ATT GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG 1909 
Glu Gly Asn Ile Trp Ile Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr
560 565 570

GCT ATC TTT GAT GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT 1957
Ala Ile Phe Asp Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser
575 580 585 590

AAA CAG TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005
Lys Gln Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn
595 600 605

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA AGT AGA 2053
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg
610 615 620

ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC GAA GTG ACC ACG 2101
Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr 
625 630 635 

GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT GTG TCC CAT GAG CAC AAA 2149 
Val Ser Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys 
640 645 650 

GTG GTC TTC ATG GAC GTG TGC ATG CCC GGG GTC GAA AAC TAC CAA ATC 2197 
Val Val Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile
655 660 665 670

GCT CTC CGT ATT CAC GAG AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA 2245
Ala Leu Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro
675 680 685

CTA CTT GTG GCA CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA 2293
Leu Leu Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys
690 695 700 

TGC ATG AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341 
Cys Met Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu 
705 710 715 

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG GTA CTG 2389 
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu
720 725 730 



TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA TGCCCCAGAG GAGTAATTCC 2441 

Tyr Glu Gly Met 

735

GCTCCCGCCT TCTTCTCCCG TAAAACATCG GAAGCTGATG TTCTCTGGTT TAATTGTGTA 2501
CATATCAGAG ATTGTCGGAG CGTTTTGGAT GATATCTTAA AACAGAAAGG GAATAACAAA 2561
ATAGAAACTC TAAACCGGTA TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT 2621
GGTGGTATAA TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC CTTATATATG 2681
TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG CTTGTTGCGT GCATGTATGA 2741
CATTGATGCA GTATTATGGC GTCAGCTTTG CGCCGCTTAG TAGAAC 2787



(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 738 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu
1 5 10 15

Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Val Tyr
20 25 30

Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser Ala Val
35 40 45

Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile Val Leu
50 55 60

Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr His Ser
65 70 75 80 

Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr Ala Val 
85 90 95 

Val Ser Cys Ala Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu
100 105 110 

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu 
115 120 125 

Leu Asp Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg
130 135 140

His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His
145 150 155 160

Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Ala Leu
165 170 175

Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu Leu Gln
180 185 190

Leu Ser Tyr Thr Leu Arg His Gln His Pro Val Glu Tyr Thr Val Pro
195 200 205 



Ile Gln Leu Pro Val Ile Asn Gln Val Phe Gly Thr Ser Arg Ala Val
210 215 220

Lys Ile Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly
225 230 235 240 

Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His 
245 250 255 

Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr Lys Arg
260 265 270

Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg Gln Trp
275 280 285

His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gln Val Ala
290 295 300

Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg Ala Arg
305 310 315 320

Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg Arg Glu
325 330 335

Ala Glu Thr Ala Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn
340 345 350

His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser Ser Leu
355 360 365

Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val Glu Thr
370 375 380

Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp Val Leu
385 390 395 400

Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gln Leu Glu Leu Gly Thr
405 410 415

Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu Ile Lys Pro
420 425 430

Ile Ala Val Val Lys Lys Leu Pro Ile Thr Leu Asn Leu Ala Pro Asp
435 440 445

Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gln Ile Ile
450 455 460

Leu Asn Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly Ser Ile
465 470 475 480 

Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala Asp Phe 
485 490 495 

Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys Val Lys 
500 505 510 

Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp Ile Pro Lys Ile Phe Thr
515 520 525

Lys Phe Ala Gln Thr Gln Ser Leu Ala Thr Arg Ser Ser Gly Gly Ser
530 535 540

Gly Leu Gly Leu Ala Ile Ser Lys Arg Phe Val Asn Leu Met Glu Gly
545 550 555 560

Asn Ile Trp Ile Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala Ile
565 570 575

Phe Asp Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser Lys Gln
580 585 590

Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn Phe Thr
595 600 605

Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg Met Val
610 615 620

Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr Val Ser 
625 630 635 640

Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys Val Val 
645 650 655 

Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu
660 665 670

Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu
675 680 685 

Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met 
690 695 700 

Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu Asp Asn 
705 710 715 720 

Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu Tyr Glu
725 730 735 

Gly Met 



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2787 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 188..2401



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA AAGCTTCAAC 60

GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC CAAATCCCCA ATTCCTCCTC 120 

TTCTCCGATC AATTCTTCCC AAGTGTGTGT ATGTGTGAGA GAGGAACTAT AGTGTAAAAA 180 



ATTCATA ATG GAA GTC TGC AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT 229 
Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp
1 5 10

GAA TTG TTA ATG AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT 277
Glu Leu Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile
15 20 25 30

GCG TAT TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325
Ala Tyr Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser
35 40 45

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT TTT TTC 373
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Phe
50 55 60

GTT CTT TGT GGA GCA ACT CAT CTT ATT AAC TTA TGG ACT TTC ACT ACG 421
Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr
65 70 75

CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT ACC GCG AAG GTG TTA ACC 469
His Ser Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr
80 85 90

GCT GTT GTC TCG TGT GCT ACT GCG TTG ATG CTT GTT CAT ATT ATT CCT 517
Ala Val Val Ser Cys Ala Thr Ala Leu Met Leu Val His Ile Ile Pro
95 100 105 110 

GAT CTT TTG AGT GTT AAG ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT 565 
Asp Leu Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala 
115 120 125 

GCT GAG CTC GAT AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC 613 
Ala Glu Leu Asp Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr
130 135 140

GGA AGG CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661
Gly Arg His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp
145 150 155

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG ACA TTA 709
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu
160 165 170 

GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA ACT GGG TTA GAG 757 
Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu 
175 180 185 190 

CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA CAT CCC GTG GAG TAT ACG 805 
Leu Gln Leu Ser Tyr Thr Leu Arg His Gln His Pro Val Glu Tyr Thr
195 200 205

GTT CCT ATT CAA TTA CCG GTG ATT AAC CAA GTG TTT GGT ACT AGT AGG 853
Val Pro Ile Gln Leu Pro Val Ile Asn Gln Val Phe Gly Thr Ser Arg
210 215 220

GCT GTA AAA ATA TCT CCT AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT 901
Ala Val Lys Ile Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val
225 230 235 

TCT GGG AAA TAT ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT 949 
Ser Gly Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu 
240 245 250

CTC CAC CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997
Leu His Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr
255 260 265 270

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT GCA AGG 1045
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg
275 280 285

CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC GTC GCT GAT CAG 1093
Gln Trp His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gln
290 295 300

GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC CTA GAA GAG TCG ATG CGA 1141
Val Ala Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg
305 310 315

GCT AGG GAC CTT CTC ATG GAG CAG AAT GTT GCT CTT GAT CTA GCT AGA 1189
Ala Arg Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg
320 325 330

CGA GAA GCA GAA ACA GCA ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT 1237
Arg Glu Ala Glu Thr Ala Ile Arg Ala Arg Asn Asp Phe Leu Ala Val
335 340 345 350 

ATG AAC CAT GAA ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT 1285 
Met Asn His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser
355 360 365

TCC TTA CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333
Ser Leu Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val
370 375 380

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG AAT GAT 1381
Glu Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp
385 390 395

GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT CAA CTT GAA CTT 1429
Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gln Leu Glu Leu
400 405 410

GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA GAG GTC CTC AAT CTG ATA 1477
Gly Thr Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu Ile
415 420 425 430

AAG CCT ATA GCG GTT GTT AAG AAA TTA CCC ATC ACA CTA AAT CTT GCA 1525
Lys Pro Ile Ala Val Val Lys Lys Leu Pro Ile Thr Leu Asn Leu Ala
435 440 445

CCA GAT TTG CCA GAA TTT GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG 1573
Pro Asp Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gln
450 455 460

ATA ATA TTA AAT ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT 1621
Ile Ile Leu Asn Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly
465 470 475

AGT ATC TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669
Ser Ile Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala
480 485 490 

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA GTG AAG 1717 
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys 
495 500 505 510 



GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC ATT CCA AAG ATT 1765
Val Lys Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp Ile Pro Lys Ile
515 520 525

TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA GCG ACG AGA AGC TCG GGT 1813
Phe Thr Lys Phe Ala Gln Thr Gln Ser Leu Ala Thr Arg Ser Ser Gly
530 535 540

GGT AGT GGG CTT GGC CTC GCC ATC TCC AAG AGG TTT GTG AAT CTG ATG 1861
Gly Ser Gly Leu Gly Leu Ala Ile Ser Lys Arg Phe Val Asn Leu Met
545 550 555

GAG GGT AAC ATT TGG ATT GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG 1909
Glu Gly Asn Ile Trp Ile Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr
560 565 570

GCT ATC TTT GAT GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT 1957
Ala Ile Phe Asp Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser
575 580 585 590

AAA CAG TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005
Lys Gln Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn
595 600 605 

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA AGT AGA 2053 
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg 
610 615 620 

ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC GAA GTG ACC ACG 2101 
Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr 
625 630 635 

GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT GTG TCC CAT GAG CAC AAA 2149
Val Ser Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys
640 645 650

GTG GTC TTC ATG GAC GTG TGC ATG CCC GGG GTC GAA AAC TAC CAA ATC 2197
Val Val Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile
655 660 665 670

GCT CTC CGT ATT CAC GAG AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA 2245
Ala Leu Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro
675 680 685 

CTA CTT GTG GCA CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA 2293 
Leu Leu Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys 
690 695 700 

TGC ATG AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341 
Cys Met Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu 
705 710 715 

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG GTA CTG 2389 
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu
720 725 730 

TAC GAG GGC ATG TAAAGGCGAT CGATGCCCCA TGCCCCAGAG GAGTAATTCC 2441 

Tyr Glu Gly Met 

735 

GCTCCCGCCT TCTTCTCCCG TAAAACATCG GAAGCTGATG TTCTCTGGTT TAATTGTCTA 2501 
CATATCAGAG ATTGTCGGAG CGTTTTGGAT GATATCTTAA AACAGAAAGG GAATAACAAA 2561
ATAGAAACTC TAAACCGGTA TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT 2621
GGTGGTATAA TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC CTTATATATG 2681 
TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG CTTGTTGCGT GCATGTATGA 2741 
CATTGATCCA GTATTATGGC GTCAGCTTTG CGCCGCTTAG TAGAAC 2787 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 738 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu
1 5 10 15

Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Ala Tyr
20 25 30

Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser Ala Val
35 40 45

Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Phe Val Leu
50 55 60

Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr His Ser
65 70 75 80 

Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr Ala Val 
85 90 95 

Val Ser Cys Ala Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu
100 105 110 

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu 
115 120 125 

Leu Asp Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg
130 135 140

His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His
145 150 155 160

Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Ala Leu
165 170 175

Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu Leu Gln
180 185 190

Leu Ser Tyr Thr Leu Arg His Gln His Pro Val Glu Tyr Thr Val Pro
195 200 205

Ile Gln Leu Pro Val Ile Asn Gln Val Phe Gly Thr Ser Arg Ala Val
210 215 220

Lys Ile Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly
225 230 235 240 

Lys Tyr Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His 
245 250 255 

Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr Lys Arg
260 265 270

Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg Gln Trp
275 280 285

His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gln Val Ala
290 295 300

Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg Ala Arg
305 310 315 320

Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg Arg Glu
325 330 335

Ala Glu Thr Ala Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn
340 345 350

His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser Ser Leu
355 360 365

Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val Glu Thr
370 375 380

Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met Asn Asp Val Leu
385 390 395 400

Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gln Leu Glu Leu Gly Thr
405 410 415

Phe Asn Leu His Thr Leu Phe Arg Glu Val Leu Asn Leu Ile Lys Pro
420 425 430

Ile Ala Val Val Lys Lys Leu Pro Ile Thr Leu Asn Leu Ala Pro Asp
435 440 445

Leu Pro Glu Phe Val Val Gly Asp Glu Lys Arg Leu Met Gln Ile Ile
450 455 460

Leu Asn Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly Ser Ile
465 470 475 480

Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala Asp Phe
485 490 495 

Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg Val Lys Val Lys 
500 505 510 

Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp Ile Pro Lys Ile Phe Thr
515 520 525

Lys Phe Ala Gln Thr Gln Ser Leu Ala Thr Arg Ser Ser Gly Gly Ser
530 535 540

Gly Leu Gly Leu Ala Ile Ser Lys Arg Phe Val Asn Leu Met Glu Gly
545 550 555 560

Asn Ile Trp Ile Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala Ile
565 570 575

Phe Asp Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser Lys Gln
580 585 590

Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn Phe Thr
595 600 605 

Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg Met Val 
610 615 620 

Thr Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr Val Ser 
625 630 635 640 

Ser Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys Val Val 
645 650 655 

Phe Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu
660 665 670

Arg Ile His Glu Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu
675 680 685 

Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met 
690 695 700 

Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu Asp Asn 
705 710 715 720 

Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg Val Leu Tyr Glu
725 730 735 

Gly Met 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Gln Asn Val Ala Leu Asp Leu Ala Arg Arg Glu Ala Glu Thr Ala Ile
1 5 10 15

Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu Met Arg Thr
20 25 30

Pro Met His Ala Ile Ile Ala Leu Ser Ser Leu Leu Gln Glu Thr Glu
35 40 45

Leu Thr Pro Glu Gln Arg Leu Met Val Glu Thr Ile Leu Lys Ser Ser
50 55 60

Asn Leu Leu Ala Thr Leu Met Asn Asp Val Leu Asp Leu Ser Arg Leu
65 70 75 80

Glu Asp Gly Ser Leu Gln Leu Glu Leu Gly Thr Phe Asn Leu His Thr
85 90 95

Leu Phe Arg Glu Val Leu Asn Leu Ile Lys Pro Ile Ala Val Val Lys
100 105 110

Lys Leu Pro Ile Thr Leu Asn Leu Ala Pro Asp Leu Pro Glu Phe Val
115 120 125

Val Gly Asp Glu Lys Arg Leu Met Gln Ile Ile Leu Asn Ile Val Gly
130 135 140

Asn Ala Val Lys Phe Ser Lys Gln Gly Ser Ile
145 150 155 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Gln Asn Val Glu Leu Asp Leu Ala Lys Lys Arg Ala Gln Glu Ala Ala
1 5 10 15

Arg Ile Lys Ser Glu Phe Leu Ala Asn Met Ser His Glu Leu Arg Thr
20 25 30

Pro Leu Asn Gly Val Ile Gly Phe Thr Arg Leu Thr Leu Lys Thr Glu
35 40 45

Leu Thr Pro Thr Gln Arg Asp His Leu Asn Thr Ile Glu Arg Ser Ala
50 55 60

Asn Asn Leu Leu Ala Ile Ile Asn Asp Val Leu Asp Phe Ser Lys Leu
65 70 75 80

Glu Ala Gly Lys Leu Ile Leu Glu Ser Ile Pro Phe Pro Leu Arg Ser
85 90 95

Thr Leu Asp Glu Val Val Thr Leu Leu Ala His Ser Ser His Asp Lys
100 105 110

Gly Leu Glu Leu Thr Leu Asn Ile Lys Ser Asp Val Pro Asp Asn Val
115 120 125

Ile Gly Asp Pro Leu Arg Leu Gln Gln Ile Ile Thr Asn Leu Val Gly
130 135 140

Asn Ala Ile Lys Phe Thr Glu Asn Gly Asn Ile
145 150 155

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Gln Asn Ile Glu Leu Asp Leu Ala Arg Lys Glu Ala Leu Glu Ala Ser
1 5 10 15

Arg Ile Lys Ser Glu Phe Leu Ala Asn Met Ser His Glu Ile Arg Thr
20 25 30

Pro Leu Asn Gly Ile Leu Gly Phe Thr His Leu Leu Gln Lys Ser Glu
35 40 45

Leu Thr Pro Arg Gln Phe Asp Tyr Leu Gly Thr Ile Glu Lys Ser Ala
50 55 60

Asp Asn Leu Leu Ser Ile Ile Asn Glu Ile Leu Asp Phe Ser Lys Ile
65 70 75 80

Glu Ala Gly Lys Leu Val Leu Asp Asn Ile Pro Phe Asn Leu Arg Asp
85 90 95

Leu Leu Gln Asp Thr Leu Thr Ile Leu Ala Pro Ala Ala His Ala Lys
100 105 110

Gln Leu Glu Leu Val Ser Leu Val Tyr Arg Asp Thr Pro Leu Ala Leu
115 120 125

Ser Gly Asp Pro Leu Arg Leu Arg Gln Ile Leu Thr Asn Leu Val Ser
130 135 140

Asn Ala Ile Lys Phe Thr Arg Glu Gly Thr Ile
145 150 155 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 149 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Arg Ala Val Arg Glu Ala Arg His Ala Asn Gln Ala Lys Ser Arg Phe
1 5 10 15

Leu Ala Asn Met Ser His Glu Phe Arg Thr Pro Leu Asn Gly Leu Ser
20 25 30

Gly Met Thr Glu Val Leu Ala Thr Thr Arg Leu Asp Ala Glu Gln Lys
35 40 45

Glu Cys Leu Asn Thr Ile Gln Ala Ser Ala Arg Ser Leu Leu Ser Leu
50 55 60

Val Glu Glu Val Leu Asp Ile Ser Ala Ile Glu Ala Gly Lys Ile Arg
65 70 75 80

Ile Asp Arg Arg Asp Phe Ser Leu Arg Glu Met Ile Gly Ser Val Asn
85 90 95

Leu Ile Leu Gln Pro Gln Ala Arg Gly Arg Arg Leu Glu Tyr Gly Thr
100 105 110

Gln Val Ala Asp Asp Val Pro Asp Leu Leu Lys Gly Asp Thr Ala His
115 120 125

Leu Arg Gln Val Leu Leu Asn Leu Val Gly Asn Ala Val Lys Phe Thr
130 135 140 

Glu His Gly His Val 
145 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Leu Lys Val Leu Val Met Asp Glu Asn Gly Val Ser Arg Met Val Thr 
1 5 10 15 

Lys Gly Leu Leu Val His Leu Gly Cys Glu Val Thr Thr Val Ser Ser 
20 25 30 

Asn Glu Glu Cys Leu Arg Val Val Ser His Glu His Lys Val Val Phe 
35 40 45 

Met Asp Val Cys Met Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu Arg
50 55 60

Ile His
65 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Leu Arg Val Leu Val Val Asp Asp His Lys Pro Asn Leu Met Leu Leu
1 5 10 15

Arg Gln Gln Leu Asp Tyr Leu Gly Gln Arg Val Val Ala Ala Asp Ser
20 25 30 

Gly Glu Ala Ala Leu Ala Leu Trp His Glu His Ala Phe Asp Val Val 
35 40 45

Ile Thr Asp Cys Asn Met Pro Gly Ile Asn Gly Tyr Glu Leu Ala Arg
50 55 60

Arg Ile Arg
65 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Met Ile Leu Val Val Asp Asp His Pro Ile Asn Arg Arg Leu Leu
1 5 10 15

Ala Asp Gln Leu Gly Ser Leu Gly Tyr Gln Cys Lys Thr Ala Asn Asp
20 25 30

Gly Val Asp Ala Leu Asn Val Leu Ser Lys Asn His Ile Asp Ile Val
35 40 45

Leu Ser Asp Val Asn Met Pro Asn Met Asp Gly Tyr Arg Leu Thr Gln
50 55 60

Arg Ile Arg
65 

(2) INFORMATION FOR SEQ ID NO: 19: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Pro Arg Val Leu Cys Val Asp Asp Asn Pro Ala Asn Leu Leu Leu Val
1 5 10 15

Gln Thr Leu Leu Glu Asp Met Gly Ala Glu Val Val Ala Val Glu Gly
20 25 30

Gly Tyr Ala Ala Val Asn Ala Val Gln Gln Glu Ala Phe Asp Leu Val
35 40 45

Leu Met Asp Val Gln Met Pro Gly Met Asp Gly Arg Gln Ala Thr Glu
50 55 60

Ala Ile Arg
65 



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 369 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATGGAATCCT GTGATTGCAT TGAGGCTTTA CTGCCAACTG GTGACCTGCT GGTTAAATAC 60

CAATACCTCT CAGATTTCTT CATTGCTGTA GCCTACTTTT CCATTCCGTT GGAGCTTATT 120

TATTTTGTCC ACAAATCTGC ATGCTTCCCA TACAGATGGG TCCTCATGCA ATTTGGTGCT 180

TTTATTGTGC TCTGCGGAGC AACACACTTT ATTAGCTTGT GGACCTTCTT TATGCACTCT 240

AAGACGGTCG CTGTGGTTAT GACCATATCA AAAATGTTGA CAGCTGCCGT GTCCTCTATC 300

ACAGCTTTGA TGCTTGTTCA CATTATTCCT GATTTGCTAA GTGTTAAAAC GCGAGAGTTG 360

TTCTTGAAA 369

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 369 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATGGAAGTCT GCAATTGTAT TGAACCGCAA TGGCCAGCGG ATGAATTGTT AATGAAATAC 60

CAATACATCT CCGATTTCTT CATTGCGATT GCGTATTTTT CGATTCCTCT TGAGTTGATT 120

TACTTTGTGA AGAAATCAGC CGTGTTTCCG TATAGATGGG TACTTGTTCA GTTTGGTGCT 180

TTTATCGTTC TTTGTGGAGC AACTCATCTT ATTAACTTAT GGACTTTCAC TACGCATTCG 240

AGAACCGTGG CGCTTGTGAT GACTACCGCG AAGGTGTTAA CCGCTGTTGT CTCGTGTGCT 300

ACTGCGTTGA TGCTTGTTCA TATTATTCCT GATCTTTTGA GTGTTAAGAC TCGGGAGCTT 360

TTCTTGAAA 369

(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 296 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

GCTCTTTCAC ATGCTGCAAT TTTAGAAGAT TCCATGCGAG CCCATGATCA GCTCATGGAA 60 

CAGAATATTG CTTTGGATGT AGCTCGACAA GAAGCAGAGA TGGCCATCCG TGCACGTAAC 120 

GACTTCCTTG CTGTGATGAA CCATGAAATG AGAACGCCCA TGCATGCAGT TATTGCTCTG 180 

TGCTCTCTGC TTTTAGAAAC AGACTTAACT CCAGAGCAGA GAGTTATGAT TGAGACCATA 240 

TTGAAGAGCA GCAATCTTCT TGCAACACTG ATAAATGATG TTCTAGATCT TTCTAG 296 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 296 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GCTCTCTCAC ATGCTGCGAT CCTAGAAGAG TCGATGCGAG CTAGGGACCT TCTCATGGAG 60 

CAGAATGTTG CTCTTGATCT AGCTAGACGA GAAGCAGAAA CAGCAATCCG TGCCCGCAAT 120

GATTTCCTAG CGGTTATGAA CCATGAAATG CGAACACCGA TGCATGCGAT TATTGCACTC 180

TCTTCCTTAC TCCAAGAAAC GGAACTAACC CCTGAACAAA GACTGATGGT GGAAACAATA 240 

CTTAAAAGTA GTAACCTTTT GGCAACTTTG ATGAATGATG TCTTAGATCT TTCAAG 296

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 123 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Met Glu Ser Cys Asp Cys Ile Glu Ala Leu Leu Pro Thr Gly Asp Leu
1 5 10 15

Leu Val Lys Tyr Gln Tyr Leu Ser Asp Phe Phe Ile Ala Val Ala Tyr
20 25 30

Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val His Lys Ser Ala Cys
35 40 45

Phe Pro Tyr Arg Trp Val Leu Met Gln Phe Gly Ala Phe Ile Val Leu
50 55 60

Cys Gly Ala Thr His Phe Ile Ser Leu Trp Thr Phe Phe Met His Ser
65 70 75 80

Lys Thr Val Ala Val Val Met Thr Ile Ser Lys Met Leu Thr Ala Ala
85 90 95

Val Ser Cys Ile Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu
100 105 110 

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys 
115 120 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Met Glu Val Cys Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu
1 5 10 15

Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Ala Tyr
20 25 30

Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser Ala Val
35 40 45

Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile Val Leu
50 55 60 




Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Thr Thr His Ser
65 70 75 80 

Arg Thr Val Ala Leu Val Met Thr Thr Ala Lys Val Leu Thr Ala Val 
85 90 95 

Val Ser Cys Ala Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu
100 105 110 

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys 
115 120 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
AGTAAGAACG AAGAAGAAGT G 21

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Leu Arg Val Lys Val Lys Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp
1 5 10 15

Ile Pro Lys Ile Phe Thr Lys Phe Ala Gln Thr Gln Ser Leu Ala Thr
20 25 30

Arg Ser Ser Gly Gly Ser Gly Leu Gly Leu Ala Ile Ser Lys Arg Phe
35 40 45

Val Asn Leu Met Glu Gly Asn Ile
50 55 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Ile Glu Val Gln Ile Arg Asp Thr Gly Ile Gly Ile Pro Glu Arg Asp
1 5 10 15

Gln Ser Arg Leu Phe Gln Ala Phe Arg Gln Ala Asp Ala Ser Ile Ser
20 25 30

Arg Arg His Gly Gly Thr Gly Leu Gly Leu Val Ile Thr Gln Lys Leu
35 40 45

Val Asn Glu Met Gly Gly Asp Ile
50 55 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Leu Arg Ile Ser Val Gln Asp Thr Gly Ile Gly Leu Ser Ser Gln Asp
1 5 10 15

Val Arg Ala Leu Phe Gln Ala Phe Ser Gln Ala Asp Asn Ser Leu Ser
20 25 30

Arg Gln Pro Gly Gly Thr Gly Leu Gly Leu Val Ile Ser Lys Arg Leu
35 40 45

Ile Glu Gln Met Gly Gly Glu Ile
50 55 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Leu Arg Phe Asp Val Glu Asp Thr Gly Ile Gly Val Pro Met Asp Met
1 5 10 15

Arg Pro Arg Leu Phe Glu Ala Phe Glu Gln Ala Asp Val Gly Leu Ser
20 25 30

Arg Arg Tyr Glu Gly Thr Gly Leu Gly Thr Thr Ile Ala Lys Gly Leu
35 40 45

Val Glu Ala Met Gly Gly Ser Ile
50 55

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Pro Leu Leu Val Ala Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu
1 5 10 15

Lys Cys Met Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser
20 25 30

Leu Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu
35 40 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

Cys Ile Leu Phe Gly Phe Thr Ala Ser Ala Gln Met Asp Glu Ala His
1 5 10 15

Ala Cys Arg Ala Ala Gly Met Asp Asp Cys Leu Phe Lys Pro Ile Gly
20 25 30

Val Asp Ala Leu Arg Gln Arg Leu Asn Glu Ala Ala
35 40 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Leu Pro Val Ile Gly Val Thr Ala Asn Ala Leu Ala Glu Glu Lys Gln
1 5 10 15

Arg Cys Leu Glu Ser Gly Met Asp Ser Cys Leu Ser Lys Pro Val Thr
20 25 30

Leu Asp Val Ile Lys Gln Ser Leu Thr Leu Tyr Ala
35 40

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Leu Pro Ile Val Ala Leu Thr Ala His Ala Met Ala Asn Glu Lys Arg
1 5 10 15

Ser Leu Leu Gln Ser Gly Met Asp Asp Tyr Leu Thr Lys Pro Ile Ser
20 25 30

Glu Arg Gln Leu Ala Gln Val Val Leu Lys Trp Thr
35 40 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2405 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 288..2196

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

TTTTTTTTTT GTCAAAAGCT CGATGTAAAA ATCCGATGGC CACAAGCAAA ACGACAGGTT 60

CCAACTTCAC GGAGATTGTG AAAATGGAGT AGTAGTTCAG TGAAGTAGTA GATACTGAGA 120 

TCGCATTCTC CGGCGTCGTT TTTCACATCG AAATAGTCGT GTAAAAAAAT GAAAAAATTC 180 

CTGCGAGACA GGTATGTGTC GCAGCAGGAA ATAGCATCTT AAAGGAAGGA AGGAAGGAAA 240 

CTCGAAAGTT ACTAAAAATT TTTGATTCTT TGGGACGAAA CGAGATA ATG GAA TCC 296

Met Glu Ser
1

TGT GAT TGC ATT GAG GCT TTA CTG CCA ACT GGT GAC CTG CTG GTT AAA 344
Cys Asp Cys Ile Glu Ala Leu Leu Pro Thr Gly Asp Leu Leu Val Lys
5 10 15

TAC CAA TAC CTC TCA GAT TTC TTC ATT GCT GTA GCC TAC TTT TCC ATT 392
Tyr Gln Tyr Leu Ser Asp Phe Phe Ile Ala Val Ala Tyr Phe Ser Ile
20 25 30 35

CCG TTG GAG CTT ATT TAT TTT GTC CAC AAA TCT GCA TGC TTC CCA TAC 440
Pro Leu Glu Leu Ile Tyr Phe Val His Lys Ser Ala Cys Phe Pro Tyr
40 45 50

AGA TGG GTC CTC ATG CAA TTT GGT GCT TTT ATT GTG CTC TGT GGA GCA 488
Arg Trp Val Leu Met Gln Phe Gly Ala Phe Ile Val Leu Cys Gly Ala
55 60

ACA CAC TTT ATT AGC TTG TGG ACC TTC TTT ATG CAC TCT AAG ACG GTC 536
Thr His Phe Ile Ser Leu Trp Thr Phe Phe Met His Ser Lys Thr Val
70 75 80

GCT GTG GTT ATG ACC ATA TCA AAA ATG TTG ACA GCT GCC GTG TCC TGT 584
Ala Val Val Met Thr Ile Ser Lys Met Leu Thr Ala Ala Val Ser Cys
85 90 95

ATC ACA GCT TTG ATG CTT GTT CAC ATT ATT CCT GAT TTG CTA AGT GTT 632
Ile Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val
100 105 110

AAA ACG CGA GAG TTG TTC TTG AAA ACT CGA GCT GAA GAG CTT GAC AAG 680
Lys Thr Arg Glu Leu Phe Leu Lys Thr Arg Ala Glu Glu Leu Asp Lys
120 125 130



GAA ATG GGC CTA ATA ATA AGA CAA GAA GAA ACT GGC AGA CAT GTC AOJ 728
Glu Met Gly Leu Ile Ile Arg Gln Glu Glu Thr Gly Arg His Val Arg
135 140 145

ATG CTG ACT CAT GAG ATA AGA AGC ACA CTC GAC AGA CAC ACA ATC TTG 776
Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His Thr Ile Leu
150 155 160

AAG ACT ACT CTT GTG GAG CTA GGT AGG ACC TTA GAC CTG GCA GAA TGT 824
Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Asp Leu Ala Glu Cys
165 170 175

GCT TTG TGG ATG CCA TGC CAA GGA GGC CTC ACT TTG CAA CTT TCC CAT 872
Ala Leu Trp Met Pro Cys Gln Gly Gly Leu Thr Leu Gln Leu Ser His
180 185 190 195

AAT TTA AAC AAT CTA ATA CCT CTC GGA TCT ACT GTC CCA ATT AAT CTT 920
Asn Leu Asn Asn Leu Ile Pro Leu Gly Ser Thr Val Pro Ile Asn Leu
200 205 210

CCT ATT ATC AAT GAA ATT TTT AGT AGC CCT GAA GCA ATA CAA ATT CCA 968
Pro Ile Ile Asn Glu Ile Phe Ser Ser Pro Glu Ala Ile Gln Ile Pro
215 220 225

CAT ACA AAT CCT TTG GCA AGG ATG AGG AAT ACT GTT GGT AGA TAT ATT 1016
His Thr Asn Pro Leu Ala Arg Met Arg Asn Thr Val Gly Arg Tyr Ile
230 235 240

CCA CCA GAA GTA GTT GCT GTT CGT GTA CCG CTT TTA CAC CTC TCA AAT 1064 
Pro Pro Glu Val Val Ala Val Arg Val Pro Leu Leu His Leu Ser Asn 
245 250 255 

TTT ACT AAT GAC TGG GCT GAA CTC TCT ACT AGA AGT TAT GCG GTT ATG 1112
Phe Thr Asn Asp Trp Ala Glu Leu Ser Thr Arg Ser Tyr Ala Val Met
260 265 270 275

GTT CTC GTT CTC CCG ATG AAT GGC TTA AGA AAG TGG CGT GAA CAT GAG 1160
Val Leu Val Leu Pro Met Asn Gly Leu Arg Lys Trp Arg Glu His Glu
280 285 290



1208

CAT GCT GCA ATT TTA GAA GAT TCC ATG CGA GCC CAT GAT CAG CTC ATG 1256
His Ala Ala Ile Leu Glu Asp Ser Met Arg Ala His Asp Gln Leu Met
310 315 320

GAA CAG AAT ATT GCT TTG GAT GTA GCT CGA CAA GAA GCA GAG ATG GCC 1304
Glu Gln Asn Ile Ala Leu Asp Val Ala Arg Gln Glu Ala Glu Met Ala
325 330 335

ATC CGT GCA CGT AAC GAC TTC CTT GCT GTG ATG AAC CAT GAA ATG AGA 1352
Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu Met Arg
340 345 350

ACG CCC ATG CAT GCA GTT ATT GCT CTG TGC TCT CTG CTT TTA GAA ACA 1400
Thr Pro Met His Ala Val Ile Ala Leu Cys Ser Leu Leu Leu Glu Thr
360 365

GAC TTA ACT CCA GAG CAG AGA GTT ATG ATT GAG ACC ATA TTG AAG AGC 1448
Asp Leu Thr Pro Glu Gln Arg Val Met Ile Glu Thr Ile Leu Lys Ser
375 380 385

AGC AAT CTT CTT GCA ACA CTG ATA AAT GAT GTT CTA GAT CTT TCT AGA 1496
Ser Asn Leu Leu Ala Thr Leu Ile Asn Asp Val Leu Asp Leu Ser Arg
390 395 400

CTT GAA GAT GGT ATT CTT GAA CTA GAA AAC GGA ACA TTC AAT CTT CAT 1544
Leu Glu Asp Gly Ile Leu Glu Leu Glu Asn Gly Thr Phe Asn Leu His
405 410 415

GGC ATC TTA AGA GAG GCC GTT AAT TTG ATA AAG CCA ATT GCA TCT TTG 1592
Gly Ile Leu Arg Glu Ala Val Asn Leu Ile Lys Pro Ile Ala Ser Leu
420 425 430 435

AAG AAA TTA TCT ATA ACT CTT GCT TTG GCT CTG GAT TTA CCT ATT CTT 1640 
Lys Lys Leu Ser Ile Thr Leu Ala Leu Ala Leu Asp Leu Pro Ile Leu
440 445 450

^ "t^*^ '^5,'^ ^ ^ ACT CTC TTA AAC GTG GTG 1688 

Ala Val Gly Asp Ala Lys Arg Leu Ile Gln Thr Leu Leu Asn Val Val
455 460 465 

GGA AAT GCT GTG AAG TTC ACT AAA GAA GGA CAT ATT TCA ATT GAG GCT 1736 
Gly Asn Ala Val Lys Phe Thr Lys Glu Gly His Ile Ser Ile Glu Ala
470 475 480

Ifr ^ i^*^ ^ o*^ ^? ^^"^ <=AT CCT CCT GAA ATC 1784 

Ser Val Ala Lys Pro Glu Tyr Ala Arg Asp Cys His Pro Pro Glu Met
485 490 495 

TTC CCT ATG CCA AGT GAT GGC CAG TTT TAT TTG CGT GTC CAG GTT AGA 1832
Phe Pro Met Pro Ser Asp Gly Gln Phe Tyr Leu Arg Val Gln Val Arg
500 505 510 515

GAT ACT GGG TGT GGA ATT AGC CCA CAA GAT ATA CCA CTA GTA TTC ACC 1880
Asp Thr Gly Cys Gly Ile Ser Pro Gln Asp Ile Pro Leu Val Phe Thr
520 525 530

AAA TTT GCA GAG TCA CGG CCT ACG TCA AAT CGA AGT ACT GGA GGG GAA 1928
Lys Phe Ala Glu Ser Arg Pro Thr Ser Asn Arg Ser Thr Gly Gly Glu
535 540 545

GGT CTA GGG CTT GCC ATT TGG AGA CGA TTT ATT CAA CTT ATG AAA GGT 1976
Gly Leu Gly Leu Ala Ile Trp Arg Arg Phe Ile Gln Leu Met Lys Gly
550 555 560

565 570

T-T-r r-Ti. rTT AAA CTC GGA ATC TGT CAC CAT CCA AAT GCA TTA CCT CTG 2072 
Phe Val Val Lys Leu Gly Ile Cys His His Pro Asn Ala Leu Pro Leu
580 585 

CTA CCT ATG CCT CCC AGA GGC AGA TTG AAC AAA GGT AGC GAT GAT CTC 2120 
Leu Pro Met Pro Pro Arg Gly Arg Leu Asn Lys Gly Ser Asp Asp Leu

600 

rrrr agG TAT AGA CAG TTC CGT GGA GAT GAT GGT GGG ATG TCT GTG AAT 2168 
Phe Arg Tyr Arg Gln Phe Arg Gly Asp Asp Gly Gly Met Ser Val Asn

615 

GCT CAA CGC TAT CAA AGA AGT ATG TAA A TGACAAAAGG ACATTGGTGT 2216 
Ala Gln Arg Tyr Gln Arg Ser Met *
630 

GACAAAGAAC ATTAAATCAT GACTAGTGAA mGAGATTT CTTCACTGTT CTGTACACTC 2276
CAAATCGCAC AGTTTGTCTT GTAACTAACC TAATTCAATG CTCGTAAAGT GAGTACTGGA 2336
GTATCTTGAA AATCTAACTA TCGAATTTAT ACATCGAGCT TTTGACAAAA AAAAAAAAAA 2396
AAAAAAAAA 2405

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 636 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Met Glu Ser Cys Asp Cys Ile Glu Ala Leu Leu Pro Thr Gly Asp Leu
1 5 10 15

Leu Val Lys Tyr Gln Tyr Leu Ser Asp Phe Phe Ile Ala Val Ala Tyr
20 25 30

Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val His Lys Ser Ala Cys
35 40 45

Phe Pro Tyr Arg Trp Val Leu Met Gln Phe Gly Ala Phe Ile Val Leu
50 55 60

Cys Gly Ala Thr His Phe Ile Ser Leu Trp Thr Phe Phe Met His Ser
65 70 75 80

Lys Thr Val Ala Val Val Met Thr Ile Ser Lys Met Leu Thr Ala Ala
85 90 95

Val Ser Cys Ile Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu
100 105 110

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Thr Arg Ala Glu Glu
115 120 125

Leu Asp Lys Glu Met Gly Leu Ile Ile Arg Gln Glu Glu Thr Gly Arg
130 135 140 

His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His
145 150 155 160

Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Asp Leu
165 170 175

Ala Glu Cys Ala Leu Trp Met Pro Cys Gln Gly Gly Leu Thr Leu Gln
180 185 190

Leu Ser His Asn Leu Asn Asn Leu Ile Pro Leu Gly Ser Thr Val Pro
195 200 205

Ile Asn Leu Pro Ile Ile Asn Glu Ile Phe Ser Ser Pro Glu Ala Ile
210 215 220

Gln Ile Pro His Thr Asn Pro Leu Ala Arg Met Arg Asn Thr Val Gly
225 230 235 240

Arg Tyr Ile Pro Pro Glu Val Val Ala Val Arg Val Pro Leu Leu His
245 250 255

Leu Ser Asn Phe Thr Asn Asp Trp Ala Glu Leu Ser Thr Arg Ser Tyr
260 265 270

Ala Val Met Val Leu Val Leu Pro Met Asn Gly Leu Arg Lys Trp Arg
275 280 285

Glu His Glu Leu Glu Leu Val Gln Val Val Ala Asp Gln Val Ala Val
290 295 300

Ala Leu Ser His Ala Ala Ile Leu Glu Asp Ser Met Arg Ala His Asp
305 310 315 320

Gln Leu Met Glu Gln Asn Ile Ala Leu Asp Val Ala Arg Gln Glu Ala
325 330 335

Glu Met Ala Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His
340 345 350

Glu Met Arg Thr Pro Met His Ala Val Ile Ala Leu Cys Ser Leu Leu
355 360 365

Leu Glu Thr Asp Leu Thr Pro Glu Gln Arg Val Met Ile Glu Thr Ile
370 375 380

Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Ile Asn Asp Val Leu Asp
385 390 395 400

Leu Ser Arg Leu Glu Asp Gly Ile Leu Glu Leu Glu Asn Gly Thr Phe
405 410 415

Asn Leu His Gly Ile Leu Arg Glu Ala Val Asn Leu Ile Lys Pro Ile
420 425 430

Ala Ser Leu Lys Lys Leu Ser Ile Thr Leu Ala Leu Ala Leu Asp Leu
435 440 445

Pro Ile Leu Ala Val Gly Asp Ala Lys Arg Leu Ile Gln Thr Leu Leu
450 455 460

Asn Val Val Gly Asn Ala Val Lys Phe Thr Lys Glu Gly His Ile Ser
465 470 475 480 

Ile Glu Ala Ser Val Ala Lys Pro Glu Tyr Ala Arg Asp Cys His Pro
485 490 495

Pro Glu Met Phe Pro Met Pro Ser Asp Gly Gln Phe Tyr Leu Arg Val
500 505 510

Gln Val Arg Asp Thr Gly Cys Gly Ile Ser Pro Gln Asp Ile Pro Leu
515 520 525

Val Phe Thr Lys Phe Ala Glu Ser Arg Pro Thr Ser Asn Arg Ser Thr
530 535 540

Gly Gly Glu Gly Leu Gly Leu Ala Ile Trp Arg Arg Phe Ile Gln Leu
545 550 555 560

Met Lys Gly Asn Ile Trp Ile Glu Ser Glu Gly Pro Gly Lys Gly Thr
565 570 575

Thr Val Thr Phe Val Val Lys Leu Gly Ile Cys His His Pro Asn Ala
580 585 590

Leu Pro Leu Leu Pro Met Pro Pro Arg Gly Arg Leu Asn Lys Gly Ser
595 600 605

Asp Asp Leu Phe Arg Tyr Arg Gln Phe Arg Gly Asp Asp Gly Gly Met
610 615 620

Ser Val Asn Ala Gln Arg Tyr Gln Arg Ser Met *
625 630 635 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4566 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(763..1671, 3062..3433, 3572..3838, 3969..4096, 4234..4402)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

AGATCTGGTA CTACCAAAAG GTATCCAATT AATCCATGCT TGGCCTCCCA TTACAATGCC 60 

TGTAAGAAAT AATTGTTCTT TCCACCTCCA CAACTAATTG TCGAACTATT ATATCTATCT 120 

TrATTCCCTT AAATGTX3AAA CGAATTACAC AGACTATTTG GCGCTACTTT TTTCCTAGAT 180 

ATATTGAAGA CCTAGTTTCT TATATTTGTG GGAAGCATTT GGAAGTTCTA TAAGAACTAT 240 

ATCATGTTCG AAAACATTCT TATAATTTTC GACAAGATTG CTGAAGGAGT GTCTTATCTT 300 

TTATGTATTC TTGACTAGAG GAGTTTAATA AAAAGAAAAT AGAAAGGAAC AAAGAAACGT 360

ACAAGTGTAT AAAAGGAGTT GGGGCAAAGA CATCAGAAAC ATTTAGACCT ACGATTTCAT 420

CCTACATGTT ATGGTTTTAG TTCGTTAGAG GTTTTAACAT ATTAAATCAG CAAAGTTCTG 480

ACATACATAA AGTGCATAAC ATAAAGATGA AATTCACAAT TTCCTCGATC TTTTCGTCCA 540 

AGGGAACTAT TTTTTACACT ATAAGTTAGC TGTTAATTTC AATATTGGCT CTTCTACACC 600 

TTGTTGTTCT TGAGTATAAT TCTATTTTGC ATCAAACATA TGTCAGAACT TATCCTC 660 
TTAAATATAT TCAGGTTGTT TAACTCTTGT ACAGCTTGTT ATTCTTCTCA GGTCTATTTC 720 

CTTCTCCTTA TTTGCTAACT TGTGCTCCAG TTATCTTCCA TC GTG GAG TCA TCT 774 

Val Glu Ser Cys 

AAC TGC ATC ATT GAC CCA CAG TTG CCT GCT GAC GAC TTC CTA ATC AAG 855 
Asn Cys He He Asp Pro Gin Leu Pro Ala Asp Asp iSu Leu iSt^ys 
5 10 15 lo 

TAT CAG TAC ATT TCT GAT TTT TTC ATA GCA CTT GCT TAT TTC TCC ATT B7n 
Tyr Gin Tyr He Ser Asp Phe Phe He Ala Leu Ala Tyi Phe Ser lie 
25 30 35 

CCA GTG GAG TTG ATA TAC TTC GTT AAG AAG TCT GCT GTC TIT CCA TAT 918 
Pro Val Glu Leu He Tyr Phe Val Lys Lys Ser Ala Val^e Pro^r 
40 45 SQ "° '^y^ 

AGA TCG GTT CTT GTG CAG TTC GGT GCT TTC ATA GTT CTT TCT GGA GCA Sfifi 
Arg Trp Val Leu Val Gin Phe Gly Ala Phe He Val Leu^s Gly^la 
55 60 65 

ACC CAT CTT ATC AAC TTA TGG ACA TTT AAT ATG CAT ACA AGG AAT GTG i ni ^ 
Thr His Leu He Asn Leu Trp Thr Phe Asn Met hIs Thr Sg^sn^^ " 

75 80 

GCA ATA GTA ATG ACT ACT GCA AAG GCC TTC ACT GCA CTC GTC TCA TCT 1062
Ala Ile Val Met Thr Thr Ala Lys Ala Leu Thr Ala Leu Val Ser Cys
85 90 95 100

ATA ACT GCT CTC ATC CTT GTC CAC ATC ATT CCT GAT TTA TTA ACT GTC 1110 
Ile Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val
105 110 115 

AAA ACT AGA GAA CTC TTC TTC AAA AAG AAA GCT GCA CAG CTT GAC CGT 1158 
Lys Thr Arg Glu Leu Phe Leu Lys Lys Lys Ala Ala Gln Leu Asp Arg
120 125 130 

GAA ATG GCT ATT ATT CGG ACT CAG GAG GAG ACA GGT AGA CAT GTT AGA 1206 
Glu Met Gly Ile Ile Arg Thr Gln Glu Glu Thr Gly Arg His Val Arg
135 140 145 

ATC CTA ACT CAT GAA ATC CGA AGC ACT CTT GAT AGA CAT ACT ATT Tra i->c^ 
Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His Thr Ile Leu
150 155 160

^ ^ ^ ^^.^'^ ACA TTC GCA TTC GAA GAG TCT 1302 

Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Ala Leu Glu Glu Cys

170 175 180 

GCA TTA TCG ATC CCA ACA CGT ACT GGA CTA GAG CTT CAG CTT TCT TAC 1350 
Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu Leu Gln Leu Ser Tyr
185 190 195



230 235 240

GGT GAG GTC GTT GCT GTC AGG GTT CCA CTT CTG CAT CTG TCG AAC TTC 1542 
Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His Leu Ser Asn Phe
245 250 255

CAG ATT AAT GAT TCG CCT GAA CTT TCA ACA AAG CGC TAT GCT TTA ATC 1590 
Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr Lys Arg Tyr Ala Leu Met
265 270 275


Val Leu Met Leu Pro Ser Asp Ser Ala Arg Gln Trp His Val His Glu
280 285 290

CTC GAG CTT GTT GAA GTC GTA GCT GAT CAG GTT TCATTTTTCT TATTCAAAAT 1691 
Leu Glu Leu Val Glu Val Val Ala Asp Gln Val

295 300 
TCCTTAATAT AATCTTAAAA TTTCTCTTTT ATATATTTTT GGGTTCAACA CAACCACGTT 1751 
GACATACTCA GTTCTCGGTC TAAAATTAGA CATGGAGAAG ACCAATTACA AAAATCTCAG 1811 
AATCTCCTAG CAGAATCACA AGGCTTAGTT GTTCTTAGTA TTATGGTTTT ATCCATTGGA 1871 
ATTGCACAGC AGAATTOTTA TTACTCTTAT TnTTTTTAA AATTTTCAAA GATAAATCAA 1931 
AAGCTCAACT ATATCACTTT TTCCATACTT CGTCTCCTCA TTGCmTTC GTCATGGAAT 1991 
AGTTAGGCTC GGTTOTCGAT GAGTATATCA TAGTAGATTT TCTCATAGGA TCTTAACTCC 2051 
TTCGCmrc TTTTCTATAG ATGATCCCTT GTATTAGAAG CACGGGAAAT AGGATCGATC 2111 
GTATATAGAA ATATTAGGAA CAGCTTTCTC AATCATTPGA ATATTCCTTT TATGGAACAT 2171 
AGAACTCTTO ACGTGTATCT AGTTTTCTTA GTACTTTTAT CATATCAAGT GAAAATAACG 2231 
•mrcCGATA ATCTATITCA GTCTCTAAAA TTAAATACTA CTCAGTTTTA CAAAAATAAT 2291 
TCTTCAACGG AAGCCATTTA TTrrrrTTAC ATATCTC3GCA TCTTACTTCT CCATCAAAGA 2351 
CTTTAGACAA CTTTAACrrr TTCATTCTGT CTCTCGTAGT GTACTGTTCT CTCATGTATC 2411 
TAATTAGCTC ACTGGCAAGT AGCACACCTA GTCTTTGTTT GACTTGTrTA AAAATCATCA 2471 
TGTATCATCA GTTACGGTCA AGTCTCCAAG TTTTACTGCT TTTTCCTATT TCCATTCCAG 2531 
AGTCTTAAAA CATTrCAGTT ATTCCTOGAT TTCTCCTGTT TATCAATGGA AAATTCAACT 2591 
ATCAACTATC CCTCAATCAA TAAATCAAAC CTCTATATCT AACCACTCCA ACTCACATCC 2651 
AGAAATCAGA TTTCAAAGAA ATTCATCATA ACTCAACTAT AGGATTCCTC TTAACCAAGA 2711 
GTAATCCrCA TITCTCCAGA CAGGCGACCA GCTATTATCC TITCATTATC GGAAAAATTC 2771 
ACAATTAATT AAAGGAAGGA ACAACTGAAG AAAAGACATC CTTGTCAGCT TCCTCTCCCA 2831

ACCCTTGCCT GAATAAGACA AAAAGTTTCT TGGAGAAAAC TCTGAATATT GGTATCCACC 2891 

TCCTTTCTCC TAATTTAGGA TGCTCTATTT CTAGACATAT AGGGGAATAC TCTATTCTAG 2951 

TGGTCGGTGT CTGGTTGCAA CTAGTTTTAG ATGTTTATAT GTCTTATTTG ATTTAATAAG 3011 

AGCTATCCTT GAGTGCCCAA TGTGATTTAA TCTACGCTTC GGCATTTCAG GTT GCT 3067 

Val Ala 
305 

GTT GCT CTT TCA CAT GCT GCT ATA TTA GAA GAA TCA ATG AGG GCT AGG 3115 
Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg Ala Arg
310 315 320 

GAT CTT CTT ATG GAG CAG AAT GTG GCT CTT GAT CTG GCA AGA AGA GAA 3163 
Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg Arg Glu
325 330 335 

GCA GAA ATG GCT GTT CGT GCA CGT AAT GAT TTC TTG GCT GTT ATG AAT 3211 
Ala Glu Met Ala Val Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn 
340 345 350 

CAT GAA ATG AGA ACT CCC ATG CAT GCA ATA ATT GCA CTT TCT TCC TTA 3259 
His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser Ser Leu
355 360 365 

CTA CAA GAA ATC GAT CTA ACT CCA GAG CAA CGT CTG ATG GTT GAA ACA 3307
Leu Gln Glu Ile Asp Leu Thr Pro Glu Gln Arg Leu Met Val Glu Thr
370 375 380 385 

ATC CTC AAA AGC AGC AAC CTT TTA GCA ACG CTC ATC AAC GAT GTC TTG 3355 
Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Ile Asn Asp Val Leu
390 395 400 

GAT CTT TCA AGG CTA GAG GAT GGA AGT CTT CAA CTT GAT ATT GGC ACT 3403 
Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gln Leu Asp Ile Gly Thr
405 410 415 

TTC AAT CTC CAT GCT TTA TTT AGA GAG GTG CCCTTCATCA CCCTCTTTTC 3453

Phe Asn Leu His Ala Leu Phe Arg Glu Val
420 425

TTTTTTACTT GCAAATTCTA GATTACCTGT CAGAAAAAAA GTGTCATTAC AGATATTTTC 3513 

CACTTCAATA TGTTTGCTGG ACCTGCTGAC TGATATATGT GTCTGCTTAT TCCTGTAG 3571 

GTC CAT AGC TTA ATC AAG CCT ATT GCA TCT GTG AAA AAG TCT GTT GCT 3619 
Val His Ser Leu Ile Lys Pro Ile Ala Ser Val Lys Lys Ser Val Ala
430 435 440 

CAA CTT AGT TTG TCG TCA GAT TTG CCG GAA TAT GTA ATT GGG GAT GAA 3667 
Gln Leu Ser Leu Ser Ser Asp Leu Pro Glu Tyr Val Ile Gly Asp Glu
445 450 455 

AAA CGG TTA ATG CAA ATT CTC TTA AAC GTT GTT GGC AAT GCT GTA AAG 3715 
Lys Arg Leu Met Gln Ile Leu Leu Asn Val Val Gly Asn Ala Val Lys
460 465 470 475 

TTC TCA AAG GAA GGC AAC GTA TCA ATC TCC GCT TTT GTT GCA AAA TCA 3763 
Phe Ser Lys Glu Gly Asn Val Ser Ile Ser Ala Phe Val Ala Lys Ser
480 485 490 
GAC TCT TTA AGA GAT CCT AGA GCC CCT GAA TTC TTT GCT GTC CCT AGT 3811
Asp Ser Leu Arg Asp Pro Arg Ala Pro Glu Phe Phe Ala Val Pro Ser 
495 500 505 

GAA AAT CAC TTC TAT TTA CGG GTG CAG GTATATTTTT ACAAGCTTGA 3858 
Glu Asn His Phe Tyr Leu Arg Val Gln
510 515 

TATACTATCT TCGTAGGTTA AGGATAGTCA CAAATATGAT ATTTTAGACT TATAACTGTC 3918 

AGATGITCTG TTCTTGATAT TTGTAATATT CTAAGTAATA CTTTCTGTAG ATA AAA 3974 

Ile Lys

GAT ACG GGG ATA GGA ATT ACA CCA CAG GAT ATT CCC AAC CTG TTT AGC 4022 
Asp Thr Gly Ile Gly Ile Thr Pro Gln Asp Ile Pro Asn Leu Phe Ser
520 525 530 

AAG TTT ACA CAA AGC CAA GCG CTA GCA ACT ACA AAT TCT GGT GGC ACT 4070
Lys Phe Thr Gln Ser Gln Ala Leu Ala Thr Thr Asn Ser Gly Gly Thr
535 540 545 550 

GGG err GGT CTT GCA ATT TGT AAG AG GTACGGGTAC CAG1TCCTTA 4116 
Gly Leu Gly Leu Ala Ile Cys Lys Arg
555 

GTGTTCITTT TCCGACTCTG ATTTTCATTC TACGTGAACT TGGTAACTGC TTCATATTCA 4176

ATTTCTTTCT CTTACTGTAT TTACGTATTG ACACATCTCC TGATGGGACA CAAAAAG G 4234 

TTT GTG AAT CTT ATG GAA GGA CAT ATT TGG ATT GAA AGT GAA GGT CTT 4282 
Phe Val Asn Leu Met Glu Gly His Ile Trp Ile Glu Ser Glu Gly Leu
560 565 570 575 

GGC AAG GGG TCT ACT GCT ATA TTT ATC ATT AAA CTT GGA CTT CCT GGA 4330 
Gly Lys Gly Ser Thr Ala Ile Phe Ile Ile Lys Leu Gly Leu Pro Gly
580 585 590

CGT GCA AAT GAA TCT AAG CTC CCC TTT GTG ACC AAA TTG CCA GCA AAT 4378 
Arg Ala Asn Glu Ser Lys Leu Pro Phe Val Thr Lys Leu Pro Ala Asn 
595 600 605 

CAC ACG CAG ATG AGT TTT AAG GAT TAAAGGTTTT GGTGATGGAT GAGAATGGGT 4432 
His Thr Gln Met Ser Phe Lys Asp
610 615 

GAGTACTATC TGGACCCCTT TATCCTCGAC TCTTGTCTTG CCATGCTGTT TAATGATCCA 4492 
TCTGATTGCG TGATTTCTCA TCTTATATGT ATTGAGCTGT CTTACTCACT TTACATGAGA 4552 
CTACAGTAAT ACTT 4566

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 615 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
Val Glu Ser Cys Asn Cys Ile Ile Asp Pro Gln Leu Pro Ala Asp Asp
1 5 10 15 

Leu Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Leu Ala
20 25 30

Tyr Phe Ser Ile Pro Val Glu Leu Ile Tyr Phe Val Lys Lys Ser Ala
35 40 45

Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile Val
50 55 60

Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Asn Met His
65 70 75 80

Thr Arg Asn Val Ala Ile Val Met Thr Thr Ala Lys Ala Leu Thr Ala
85 90 95

Leu Val Ser Cys Ile Thr Ala Leu Met Leu Val His Ile Ile Pro Asp
100 105 110

Leu Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Lys Lys Ala Ala
115 120 125

Gln Leu Asp Arg Glu Met Gly Ile Ile Arg Thr Gln Glu Glu Thr Gly
130 135 140

Arg His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg
145 150 155 160

His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Ala
165 170 175

Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu Leu
180 185 190

Gln Leu Ser Tyr Thr Leu Arg His Gln Asn Pro Val Gly Leu Thr Val
195 200 205

Pro Ile Gln Leu Pro Val Ile Asn Gln Val Phe Gly Thr Asn His Val
210 215 220

Val Lys Ile Ser Pro Asn Ser Pro Val Ala Arg Leu Arg Pro Ala Gly
225 230 235 240

Lys Tyr Met Pro Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His
245 250 255

Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr Lys Arg
260 265 270

Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser Ala Arg Gln Trp
275 280 285

His Val His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gln Val Val
290 295 300

Ala Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg Ala
305 310 315 320

Arg Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala Arg Arg
325 330 335 
Glu Ala Glu Met Ala Val Arg Ala Arg Asn Asp Phe Leu Ala Val Met
340 345 350 

Asn His Glu Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser Ser
355 360 365

Leu Leu Gln Glu Ile Asp Leu Thr Pro Glu Gln Arg Leu Met Val Glu
370 375 380

Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Ile Asn Asp Val
385 390 395 400

Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Gln Leu Asp Ile Gly
405 410 415

Thr Phe Asn Leu His Ala Leu Phe Arg Glu Val Val His Ser Leu Ile
420 425 430

Lys Pro Ile Ala Ser Val Lys Lys Ser Val Ala Gln Leu Ser Leu Ser
435 440 445

Ser Asp Leu Pro Glu Tyr Val Ile Gly Asp Glu Lys Arg Leu Met Gln
450 455 460

Ile Leu Leu Asn Val Val Gly Asn Ala Val Lys Phe Ser Lys Glu Gly
465 470 475 480

Asn Val Ser Ile Ser Ala Phe Val Ala Lys Ser Asp Ser Leu Arg Asp
485 490 495

Pro Arg Ala Pro Glu Phe Phe Ala Val Pro Ser Glu Asn His Phe Tyr
500 505 510

Leu Arg Val Gln Ile Lys Asp Thr Gly Ile Gly Ile Thr Pro Gln Asp
515 520 525

Ile Pro Asn Leu Phe Ser Lys Phe Thr Gln Ser Gln Ala Leu Ala Thr
530 535 540

Thr Asn Ser Gly Gly Thr Gly Leu Gly Leu Ala Ile Cys Lys Arg Phe
545 550 555 560

Val Asn Leu Met Glu Gly His Ile Trp Ile Glu Ser Glu Gly Leu Gly
565 570 575

Lys Gly Ser Thr Ala Ile Phe Ile Ile Lys Leu Gly Leu Pro Gly Arg
580 585 590

Ala Asn Glu Ser Lys Leu Pro Phe Val Thr Lys Leu Pro Ala Asn His
595 600 605

Thr Gln Met Ser Phe Lys Asp
610 615 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 737 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 33..719



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

AAGATAAGAG TGATTCATTA AGGAGTTTGT TC ATC ATG GAT TGT AAC TGC TTC 53 

Ile Met Asp Cys Asn Cys Phe
1 5 

GAT CCA CTG TTG CCT GCC GAT GAG TTG TTA ATG AAG TAT CAG TAC ATT 101 
Asp Pro Leu Leu Pro Ala Asp Glu Leu Leu Met Lys Tyr Gln Tyr Ile
10 15 20 

TCT GAT TTT TTC ATT GCA GTT GCT TAT TTT TCC ATC CCA ATC GAA CTG 149 
Ser Asp Phe Phe Ile Ala Val Ala Tyr Phe Ser Ile Pro Ile Glu Leu
25 30 35 

GTA TTC TTT GTC CAG AAA TCA GCT GTT TTT CCG TAT CGA TCG GTG CTT 197 
Val Phe Phe Val Gln Lys Ser Ala Val Phe Pro Tyr Arg Trp Val Leu
40 45 50 55 

GTG CAG TTT GGT GCT TTC ATA GTT CTT TGT GGA GCA ACA CAC CTT ATC 245 
Val Gln Phe Gly Ala Phe Ile Val Leu Cys Gly Ala Thr His Leu Ile
60 65 70 

AAT TTG TGG ACT TCT ACT CCT CAT ACA AGG ACT GTG GCA ATG GTG ATG 293 
Asn Leu Trp Thr Ser Thr Pro His Thr Arg Thr Val Ala Met Val Met 
75 80 85 

ACT ACG GCG AAG TTC TCC ACT GCT GCG GTA TCA TGT GCA ACT GCT GTC 341 
Thr Thr Ala Lys Phe Ser Thr Ala Ala Val Ser Cys Ala Thr Ala Val 
90 95 100 

ATG CTT GTC GCA ATT ATT CCG GAT TTA TTA AGT GTC AAA ACT AGG GAG 389 
Met Leu Val Ala Ile Ile Pro Asp Leu Leu Ser Val Lys Thr Arg Glu
105 110 115 

CTA TTC TTG AAA AAC AAA GCG GCG GAA CTT GAT CGT GAA ATG GGT CTT 437 
Leu Phe Leu Lys Asn Lys Ala Ala Glu Leu Asp Arg Glu Met Gly Leu 
120 125 130 135 

ATT CGG ACA CAG GAG GAG ACG GGT AGA TAT GTT AGA ATG CTA ACA CAT 485 
Ile Arg Thr Gln Glu Glu Thr Gly Arg Tyr Val Arg Met Leu Thr His
140 145 150 

GAA ATC AGA AGT ACT CTG GAT AGA CAT ACT ATT TTG AAG ACT ACA CTT 533 
Glu Ile Arg Ser Thr Leu Asp Arg His Thr Ile Leu Lys Thr Thr Leu
155 160 165 

GTT GAA CTT GGA AGA GCA TTG CAA CTG GAA GAG TGT GCT TTG TCG ATC 581 
Val Glu Leu Gly Arg Ala Leu Gln Leu Glu Glu Cys Ala Leu Trp Met
170 175 180 

CCG ACT CGA ACT GGA GTC GAG CTT CAA CTT TCT TAC ACT TTA CAT CAT 629 
Pro Thr Arg Thr Gly Val Glu Leu Gln Leu Ser Tyr Thr Leu His His
185 190 195 

CAA AAT CCA GTT GGA TTT ACA GTA CCT ATA CAA CTC CCT GTA ATT AAT 677 
Gln Asn Pro Val Gly Phe Thr Val Pro Ile Gln Leu Pro Val Ile Asn
200 205 210 215 
CAA GTT TTC AGT GCA AAT TGT GCT GTT AAA ATT TCA CCT TAATCTGCCG 726
Gln Val Phe Ser Ala Asn Cys Ala Val Lys Ile Ser Pro
220 225 

TTGCAAGGCT T 737



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 228 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Ile Met Asp Cys Asn Cys Phe Asp Pro Leu Leu Pro Ala Asp Glu Leu
1 5 10 15

Leu Met Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Val Ala Tyr
20 25 30

Phe Ser Ile Pro Ile Glu Leu Val Phe Phe Val Gln Lys Ser Ala Val
35 40 45

Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile Val Leu
50 55 60

Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr Ser Thr Pro His Thr
65 70 75 80

Arg Thr Val Ala Met Val Met Thr Thr Ala Lys Phe Ser Thr Ala Ala
85 90 95

Val Ser Cys Ala Thr Ala Val Met Leu Val Ala Ile Ile Pro Asp Leu
100 105 110

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu
115 120 125

Leu Asp Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg
130 135 140

Tyr Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His
145 150 155 160

Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Ala Leu Gln Leu
165 170 175

Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Val Glu Leu Gln
180 185 190

Leu Ser Tyr Thr Leu His His Gln Asn Pro Val Gly Phe Thr Val Pro
195 200 205

Ile Gln Leu Pro Val Ile Asn Gln Val Phe Ser Ala Asn Cys Ala Val
210 215 220

Lys Ile Ser Pro
225 






(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6202 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(3522..5288, 5372..5926)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
GAATTCGAAC TGCAATGGGA TAAACATTAT ATGCGTTTTA ATAATAGGTT GGTCAAGTTT 60 
ATAATTTACA CCATTTGAAA AGCCTTCCAA ATTTAGAAAC TACATTTTTG CAGACCCATG 120 
TGAGCTCATA TGAATCAATC ATAGCCTTGA TGTTGTAAAA CAAATTATGA TTATAAAAAT 180 
GTGATAGTAT ATTACATGCA TAAAAAATAA AGGAGAGTAA ATGAAAGTCA AATCTCGGTT 240 
TTATGAACTG AAAGTTGAAG TTTAGAAGTA GAAGTAGCGA TCAAAGTATG ACCAGTTAAA 300
AGGCCCAATA TCATTTGGAG GTTTGATTrT TGGGTTCGTA AATTTCAAGA GCCAGATTAT 360 
GATTTGCTGG GCTTAAAAAT CATGGAAAAA TTGAAATGAC GGTGTTAAAA TATATAACTC 420 
AAATTAAAGA TTTTAATTGG GTGTAGTAGG CTGATTTTTT TATAAGAATC TTGTCTATAG 480 
ATGCTTCAAG GTTATGCCTT ATAGTACTGG TTGTAAAACA CCACTATCTA ATTTTGAAGC 540 
TGGTCAGAAC TATAAGGTAT GTTCTTGTTC GCCTTGTTGC TAATCAAGAT TATAACATTC 600 
TGTTGTTGCA TTTTTTTTTT TTmTTGTG TTAAATATAT ATATTTTTTT TCCATATTTA 660 
TTCTTGCATA TTGTGTTGCA TATTTAGTAA TGGTTACATT CCCTGTTATC GGAGACCAAG 720 
ATAATACGGC TCTGTGGCAT GGACTACTAC TCCATGGATT CTTCCAAGTA ATCTroCTIT 780 
GTGTGTCAAT GCAAAGTTTG TTTATCTTAA GGTTCGTCAA CAACACTCGA AAAGTCTACA 840 
1TGTTGCTGA ATCTCGGTIXS TCATCGCTTC CTACTGATAA GCCTAAGGCC GGCTTAACTA 900 
ATGGAACTTA CTAGTGATAC CATAATGCGA AAGGTGCTAA TTAACCTTCA CACTCAAGAG 960 
GATTCTTATC AACTTTTGGA AAATTTTAAT GGAGATTCCT TGGTTCGGAA GAAGTATCAA 1020 
CCTTTGTTTG ATTACTTTTA GCGATTTCTC AAGTGTGACT TTTCGACTAG TAGCAGATGA 1080 
TTATGTCATG AATGATAGTG CTACTGGTAT TGTCCATTGT GCTCCTCTCT TTGGTX5CAGA 1140 
TGACTATCCT CTTTGTCTTG AGAACGAGAT AATTAAGAAG GTTAGATTTC ACAACATCTT 1200 
CCTTATATCA CCACCTTTAA CATTAAGTTT ATTTTCTTTC TTGTTTAAGT TTACAGTATC 1260 
TTCAAGAACC CATGTTCATG ACACATTTTG TTCATGTGTT GTTTAGATro TCAGAGATIT 1320 
CAAACGTCCA GATGGTTTGA AAGATACAGA CATTGATCCA GCTGTAGATA GTACATATCT 1380 
TAATTAAAAA TACCACTTCT CTATCCTCTA TTGTTGAGGA AACATATAAT ATTTGCATTC 1440
GTTCATCGTr CAGATATCAT GTTATGGTAA TTCTTGATCT ACGAGAAGAT GAATCTTTGA 1500 
AAAACGAAGG TCITGCCCGT GAGGTAAATA AATGTAACCG AAGCGATTAA TGGTCATATA 1560 
TAAGTTGTAT ATITCATATA TCGGTTTCCT TCTCATTGTG CTCATGCATT GAAAAGCACC 1620 
CTGTTATGAC TCTCGTTCTA GGAGAACATT TGCATTTGAC AGTCGGTGAC TAATTGTTAA 1680 
GCAAGAAGAA CGCATCAGAG CCnTTAAAG TGTrrTCITC TAGATCGTTG CAAAAAGTTA 1740 
AATCTCTCTT GAGACTITCT ACTCATTCTA TAGATAAAGA TGGGATTTAT TACAAAAACA 1800 
ACAAGAAACT TTOTrACTrG TCGAAATTCA AAATTATCCG AACTAGCTTC ACAAAATATG 1860 
CTCAAGAGTT TCAATCTATT -mrmO TT CTGTAATTGT ATGACTCCGT TTGAAGCATC 1920 
AAGATTATCG TTATAGGTAG TCATCCTAAA ACTCTCTGTT GTTACAGTGA CCACTAAAAA 1980 
CACCAACAAA AAAAACTTAG GTAACGTGTC GTCTAAAAAC TTCTAGGTrC AATITCTTTA 2040 
GATAGTACTA TCAATAAATA AAATAAATAT GTACAAAGGC TTTAAACAAT GATGTrTTTC 2100 
AAAGATCATT GGTAGATACT AATTAGAGCT TCAATATAAA AGAACACATG CGATTCTGAC 2160 
ATTCTGTGGT CTAACATGGT TTCTTCTAGA GTCAAAACCA TACAATTAAA AGTTAGGAAA 2220 
GTAATAGCAA TGTGGTTTCA AATATATACT CATTACTCTT TAGATTCATG TATGGTGAAG 2280 
GAAACATTAT AATAAAATCA AAGATCACAG TrTTGTAGGT CCCTCATATT AATCAACATC 2340 
TTAAGGCGTT ATACATATCT TdTTPrGTA AATATTTGAC TAATTAAAAT ATCTAATTAG 2400 
AGTATTAGAC TAATCTCATC AAATATCCGA CTACTTGTGT CAGTTCAAAA CACAGTGATT 2460 
ACGTTAGATT TTCTGCTCTT TTGTTTATAA ACAAAGCTAA TTTAAGAAAT ATATGATCTA 2520 

-rrreccTCCT TGGTCTTAAT •rrTATACTrr CTTGGAATAA AACACATTTA TTAAAATAAT 2580

TTTTAGGGTC CTAGATTCAT GTCATGTGGC TTGATAGTrT CCAACAATTA TACCAATATT 2640 
TTACTCATTC ATATACAAAT AAACAAGCTT TATTCTATTC TTCAGTCTCA TOATATACGG 2700 
GATriTCATA AAATTCAGAG TACCCATTAA TTATTCTATG TTACAGCTTG TAATAAGTTA 2760 
AATTTATAAA ACGTACAAGT TGAGGAAATA ACAAATGTTT TCAATATTAA ATGATTTATT 2820 
AATACATTAG TGACCAAAAA ATTATTAAGT GTAAOAAAAA AAACACAACT CAGAAAAAAT 2880 
TCAAAAGACC GTCTAAGTTC GGTTCATGTA AGAACAAGTG GGACCTCnT AAGTTTCTAA 2940 
ATCAGAGAAT AAAGAAGAAG AAAAAATCTC AAAACCTTCC TCTAAAACCA ACGGCTCCTA 3000 
CCTTTACrrA CACCCTATAC ATACACTTCT CTTTTTATCC TCCATCGGCO GCTTATOGCG 3060 
GTirrCCGGC ACTAATCATC TCCGGCATAT ATAAATAAAC GTACTTCACG TrTTTTTATA 3120 
TAACTTCAAA GTACTTTCAG ATTTGTCTCT ATCTCTTCAC TTTTAAGTCT TCTGGTTTTG 3180 
TCATCACCAG CTmTTTCT TCTCTCTCTG TCTCTGTCTC TOTCTTTCTC TTTGTGTATT 3240 
TTTATrCTCG TCATCGTTCT TCTTCTATGA GAGGAAGATC GGAATGTCGA AGAGAATTAG 3300 
AAGATTCTCG TACATCACTT CGTTCGAATT TCACAGGTCG ATGAGAGATC TCAGAACTCT 3360

TTCATTTTGA TCCAAACTCA TCTCTTTCAG GTATTCCAAA TTTGTCTTTC TCrerrcrrT 3420 

CTACTATTAC CCAAATTAAA GTITTGArrT TTATTTCTCA CTCTCTTTCT TCTITITCTA 3480 

ATTGCAGAGT ATAATGGACT AAGCATTTTT TTTCTCCGAA G ATG GTT AAA GAA 3533

Met Val Lys Glu 
1 

ATA GCT TCT TGG TTA TTG ATA CTA TCA ATG GTG GTG TTT GTT TCT CCG 3581 
Ile Ala Ser Trp Leu Leu Ile Leu Ser Met Val Val Phe Val Ser Pro
5 10 15 20

GTT TTA GCT ATA AAC GGC GGT GGT TAT CCA CGA TCT AAC TCC GAA GAC 3629
Val Leu Ala Ile Asn Gly Gly Gly Tyr Pro Arg Cys Asn Cys Glu Asp
25 30 35 

GAA GGA AAC AGT TTC TGG AGT ACA GAG AAC ATT CTA GAA ACT CAA ACA 3677
Glu Gly Asn Ser Phe Trp Ser Thr Glu Asn Ile Leu Glu Thr Gln Arg
40 45 50 ^ 

GTA AGC GAT TTC TTA ATC GCA GTA GCT TAT TTC TCA ATC CCT ATT car -JT^e 
Val Ser Asp Phe Leu Ile Ala Val Ala Tyr Phe Ser Ile Pro Ile Glu
55 60 65

TTA CTT TAC TTC GTC AGT TCT TCC AAT GTT CCA TTC AAA TCG GTT CTC 3773 
Leu Leu Tyr Phe Val Ser Cys Ser Asn Val Pro Phe Lys Trp Val Leu
70 75 80

TTT GAG TTT ATC GCC TTC ATT GTT CTT TCT GGT ATC ACT CAT CTT CTT 3821
Phe Glu Phe Ile Ala Phe Ile Val Leu Cys Gly Met Thr His Leu Leu
85 90 95 100

CAT GGT TCG ACT TAC TCT GCT CAT CCA TTT AGA TTA ATC ATC GCG TTT 3869 
His Gly Trp Thr Tyr Ser Ala His Pro Phe Arg Leu Met Met Ala Phe
105 110 115

ACT GTT TTC AAG ATC TTC ACT GCT TTA GTC TCT TCT GCT ACT GCG ATT 3917
Thr Val Phe Lys Met Leu Thr Ala Leu Val Ser Cys Ala Thr Ala Ile
120 125 130 

ACG ATT ACT TTC ATT CCT CTC CTT TTC AAA GTT AAA GTT AGA GAG 3965 

Thr Leu Ile Thr Leu Ile Pro Leu Leu Leu Lys Val Lys Val Arg Glu
135 140 145 

TTT ATC CTT AAG AAG AAA GCT CAT GAC CTT GGT CGT GAA GTT GGT TTO 4013 
Phe Met Leu Lys Lys Lys Ala His Glu Leu Gly Arg Glu Val Gly Leu 
150 155 160

ATT TTC ATT AAG AAA GAC ACT GCC TTT CAT GTT CGT ATC CTT ACT CAA 4061 
Ile Leu Ile Lys Lys Glu Thr Gly Phe His Val Arg Met Leu Thr Gln
165 170 175 180

GAG ATT CGT AAG TCT TTC GAT CGT CAT ACG ATT CTT TAT ACT ACT TTC 4109 
Glu Ile Arg Lys Ser Leu Asp Arg His Thr Ile Leu Tyr Thr Thr Leu
185 190 195 

Crn- GAG CTT TCG AAG ACT TTA GGC TTC CAG AAT TCT CCG GTT TGG ATC 4157 
Val Glu Leu Ser Lys Thr Leu Gly Leu Gln Asn Cys Ala Val Trp Met
200 205 210 
CCG AAT GAC GGT GGA ACG GAG ATG GAT TTG ACT CAT GAG TTG AGA GGG 4205
Pro Asn Asp Gly Gly Thr Glu Met Asp Leu Thr His Glu Leu Arg Gly
215 220 225 

AGA GGT GGT TAT GGT GGT TGT TCT GTT TCT ATG GAG GAT TTG GAT GTT 4253 
Arg Gly Gly Tyr Gly Gly Cys Ser Val Ser Met Glu Asp Leu Asp Val
230 235 240
GTT AGG ATT AGG GAG AGT GAT GAA GTG AAT GTG TTG ACT C3TT GAC TCG 4301
Val Arg Ile Arg Glu Ser Asp Glu Val Asn Val Leu Ser Val Asp Ser

245 250 255 260 

Trr ATT GCT CGA GCT AGT GGT GGT GGT GGG GAT GTT AGT GAG ATT GGT 4349 
Ser Ile Ala Arg Ala Ser Gly Gly Gly Gly Asp Val Ser Glu Ile Gly
265 270 275

err GTG GCT GCT ATT AGA ATG CCG ATG CTT CGT GTT TCG GAT TTT AAT 4397 
Ala Val Ala Ala Ile Arg Met Pro Met Leu Arg Val Ser Asp Phe Asn
280 285 290 

riTA GAG CTA AGT TAT GCG ATA CTT GTT TGT GTT TTA CCG GGC GGG ACC 4445 
Gly Glu Leu Ser Tyr Ala Ile Leu Val Cys Val Leu Pro Gly Gly Thr

295 300 305 

rcT CGG GAT TGG ACT TAT CAG GAG ATT GAG ATT GTT AAA GTT GTG GCG 4493 
Arg Arg Asp Trp Thr Tyr Gln Glu Ile Glu Ile Val Lys Val Val Ala

310 315 320 

GAT CAA GTA ACC GTT GCG TTA GAT CAT GCA GCG GTT CTT GAA GAG TCT 4541 
Asp Gln Val Thr Val Ala Leu Asp His Ala Ala Val Leu Glu Glu Ser
325 330 335 

CAG CTT ATG AGG GAG AAG CTG GCG GAA CAG AAC AGG GCG TTG CAG ATG 4589 
Gln Leu Met Arg Glu Lys Leu Ala Glu Gln Asn Arg Ala Leu Gln Met
345 350 355

rrc AAG AGA GAC GCG TTG AGA GCG AGC CAA GCG AGG AAT GCG TTT CAG 4637 
Ala Lys Arg Asp Ala Leu Arg Ala Ser Gln Ala Arg Asn Ala Phe Gln

360 365 370 

AAA ACG ATC AGC GAA GGG ATG AGG CGT CCT ATG CAT TCG ATA CTC GGT 4685 
Lys Thr Met Ser Glu Gly Met Arg Arg Pro Met His Ser Ile Leu Gly
375 380 385 

ITPP TTG TCG ATC ATT CAG GAC GAG AAG TTG AGT GAC GAG CAG AAA ATC 4733 
Leu Leu Ser Met Ile Gln Asp Glu Lys Leu Ser Asp Glu Gln Lys Met
390 395 400 

ATT GTT GAT ACG ATC GTT AAA ACA GGG AAT GTT ATC TCG AAT TTG CTC 4781 
Ile Val Asp Thr Met Val Lys Thr Gly Asn Val Met Ser Asn Leu Val
405 410 415 420

GGG GAC TCT ATC GAT GTC CCT GAC GGT AGA TTT GGT ACG GAG ATC AAA 4829 
Gly Asp Ser Met Asp Val Pro Asp Gly Arg Phe Gly Thr Glu Met Lys
425 430 435

CCG TTT AGT CTC CAT CGT ACG ATC CAT GAA GCA GCT TGT ATC GCG AGA 4877 
Pro Phe Ser Leu His Arg Thr Ile His Glu Ala Ala Cys Met Ala Arg
440 445 450

TGT TTC TCT CTA TCC AAT GGA ATT AGG TTC TTC GTT GAC GCG GAG AAG 4925 
Cys Leu Cys Leu Cys Asn Gly Ile Arg Phe Leu Val Asp Ala Glu Lys
455 460 465
TCT CTA CCT GAT AAT GTA GTA GGT GAT GAA AGA AGG GTC TTT CAA GTG 4973
Ser Leu Pro Asp Asn Val Val Gly Asp Glu Arg Arg Val Phe Gln Val
470 475 480 

ATA CTT CAT ATG GTT GGT AGT TTA GTA AAG CCT AGA AAA CGT CAA GAA 5021 
Ile Leu His Met Val Gly Ser Leu Val Lys Pro Arg Lys Arg Gln Glu
485 490 495 500 

GGA TCT TCA TTG ATG TTT AAG GTT TTG AAA GAA AGA GGA AGC TTG GAT 5069 
Gly Ser Ser Leu Met Phe Lys Val Leu Lys Glu Arg Gly Ser Leu Asp
505 510 515 

AGG AGT GAT CAT AGA TGG GCT GCT TGG AGA TCA CCG GCT TCT TCA GCA 5117 
Arg Ser Asp His Arg Trp Ala Ala Trp Arg Ser Pro Ala Ser Ser Ala 
520 525 530 

GAT GGA GAT GTG TAT ATA AGA TTT GAA ATG AAT GTA GAG AAT GAT GAT 5165 
Asp Gly Asp Val Tyr Ile Arg Phe Glu Met Asn Val Glu Asn Asp Asp
535 540 545 

TCA AGT TCT CAA TCA TTT GCT TCT GTT TCC TCC AGA GAT CAA GAA GTT 5213 
Ser Ser Ser Gln Ser Phe Ala Ser Val Ser Ser Arg Asp Gln Glu Val
550 555 560 

GGT GAT GTT AGA TTC TCC GGC GGC TAT GGG TTA GGA CAA GAT CTA AGC 5261 
Gly Asp Val Arg Phe Ser Gly Gly Tyr Gly Leu Gly Gln Asp Leu Ser
565 570 575 580 

TTT GGT GTT TGT AAG AAA GTG GTG CAG GTGAGTTTCC TTACATATCT 5308

Phe Gly Val Cys Lys Lys Val Val Gln
585 

CTTTCTAAAG TTCCTGTCAT TAGTCTGAGT TTCTGTTTAG GAGTTCTTTG ATAATGTGTG 5368

CAG TTG ATT CAT GGG AAT ATC TCG GTG GTC CCT GGC TCG GAT GGT TCA 5416 
Leu Ile His Gly Asn Ile Ser Val Val Pro Gly Ser Asp Gly Ser
590 595 600 

CCG GAG ACC ATG TCG TTG CTC CTT CGG TTT CGA CGT AGA CCC TCC ATA 5464 
Pro Glu Thr Met Ser Leu Leu Leu Arg Phe Arg Arg Arg Pro Ser Ile
605 610 615 620

TCT GTC CAT GGA TCC AGC GAG TCG CCA GCT CCT GAC CAC CAC GCT CAC 5512 
Ser Val His Gly Ser Ser Glu Ser Pro Ala Pro Asp His His Ala His 
625 630 635 

CCA CAT TCG AAT TCT CTG TTA CGT GGC TTA CAA GTT TTA T1X3 GTA GAC 5560 
Pro His Ser Asn Ser Leu Leu Arg Gly Leu Gln Val Leu Leu Val Asp
640 645 650 

ACC AAC GAT TCG AAC CGG GCA GTT ACA CGT AAA CTC TTA GAG AAA CTC 5608 
Thr Asn Asp Ser Asn Arg Ala Val Thr Arg Lys Leu Leu Glu Lys Leu 
655 660 665 

GGG TGC GAT GTA ACC GCG GTT TCC TCT GGA TTC GAT TGC CTT ACC CCC 5656 
Gly Cys Asp Val Thr Ala Val Ser Ser Gly Phe Asp Cys Leu Thr Ala 
670 675 680 

ATT GCT CCC GGC TCG TCC TCG CCT TCT ACT TCG TTT CAA GTG GTG GTG 5704 
Ile Ala Pro Gly Ser Ser Ser Pro Ser Thr Ser Phe Gln Val Val Val
685 690 695 700 
crrr GAT CTT CAA ATG GCA GAG ATG GAC GGT TAT GAA GTG GCC ATG AGG 5752
Leu Asp Leu Gln Met Ala Glu Met Asp Gly Tyr Glu Val Ala Met Arg
705 710 715 



ATC AGG AGT CGA TCT TGG CCG TTG ATT GTG GCG ACG ACA GTG AGC TTG 5800 
Ile Arg Ser Arg Ser Trp Pro Leu Ile Val Ala Thr Thr Val Ser Leu
720 725 730 

GAT GAA GAA ATG TGG GAC AAG TGT GCA CAG ATT GGA ATC AAT GGA GTT 5848 
Asp Glu Glu Met Trp Asp Lys Cys Ala Gln Ile Gly Ile Asn Gly Val
735 740 745 

GTG AGA AAG CCA GTG GTG TTA AGA GCT ATG GAG AGT GAG CTC CGA AGA 5896 
Val Arg Lys Pro Val Val Leu Arg Ala Met Glu Ser Glu Leu Arg Arg
750 755 760 

GTA TTG TTG CAA GCT GAC CAA CTT CTC TAAGTTGTTA TCTCAACTTC 5943 
Val Leu Leu Gln Ala Asp Gln Leu Leu
765 770 

TCTTCTACAT TCAAAATTTT TACACCATAG ATTTATGTCA AATATATCAA AATGAAATTT 6003 
CGAAATTGTT ATTATATATA CCACCCATAT CTCTATGATT TGTACATCCT GTTTTTTTTT 6063 
GTTCnTTTC TCATTTTGAA CCCCACGAAA TTGCATTGAA TCTTAGTATT TCGTAGGGTC 6123 
AAGAAGGAGT CAGTTTCGTA GTTTTTTGTT TTCTTTATGT TACGAACTTA CGAAACTGAA 6183 
TATGGCATTA TAGAGTTTT 6202



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 773 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Met Val Lys Glu Ile Ala Ser Trp Leu Leu Ile Leu Ser Met Val Val
1 5 10 15

Phe Val Ser Pro Val Leu Ala Ile Asn Gly Gly Gly Tyr Pro Arg Cys
20 25 30

Asn Cys Glu Asp Glu Gly Asn Ser Phe Trp Ser Thr Glu Asn Ile Leu
35 40 45

Glu Thr Gln Arg Val Ser Asp Phe Leu Ile Ala Val Ala Tyr Phe Ser
50 55 60

Ile Pro Ile Glu Leu Leu Tyr Phe Val Ser Cys Ser Asn Val Pro Phe
65 70 75 80

Lys Trp Val Leu Phe Glu Phe Ile Ala Phe Ile Val Leu Cys Gly Met
85 90 95

Thr His Leu Leu His Gly Trp Thr Tyr Ser Ala His Pro Phe Arg Leu
100 105 110 
Met Met Ala Phe Thr Val Phe Lys Met Leu Thr Ala Leu Val Ser Cys
115 120 125

Ala Thr Ala Ile Thr Leu Ile Thr Leu Ile Pro Leu Leu Leu Lys Val
130 135 140

Lys Val Arg Glu Phe Met Leu Lys Lys Lys Ala His Glu Leu Gly Arg
145 150 155 160

Glu Val Gly Leu Ile Leu Ile Lys Lys Glu Thr Gly Phe His Val Arg
165 170 175

Met Leu Thr Gln Glu Ile Arg Lys Ser Leu Asp Arg His Thr Ile Leu
180 185 190

Tyr Thr Thr Leu Val Glu Leu Ser Lys Thr Leu Gly Leu Gln Asn Cys
195 200 205

Ala Val Trp Met Pro Asn Asp Gly Gly Thr Glu Met Asp Leu Thr His
210 215 220

Glu Leu Arg Gly Arg Gly Gly Tyr Gly Gly Cys Ser Val Ser Met Glu
225 230 235 240

Asp Leu Asp Val Val Arg Ile Arg Glu Ser Asp Glu Val Asn Val Leu
245 250 255

Ser Val Asp Ser Ser Ile Ala Arg Ala Ser Gly Gly Gly Gly Asp Val
260 265 270

Ser Glu Ile Gly Ala Val Ala Ala Ile Arg Met Pro Met Leu Arg Val
275 280 285

Ser Asp Phe Asn Gly Glu Leu Ser Tyr Ala Ile Leu Val Cys Val Leu
290 295 300

Pro Gly Gly Thr Arg Arg Asp Trp Thr Tyr Gln Glu Ile Glu Ile Val
305 310 315 320

Lys Val Val Ala Asp Gln Val Thr Val Ala Leu Asp His Ala Ala Val
325 330 335

Leu Glu Glu Ser Gln Leu Met Arg Glu Lys Leu Ala Glu Gln Asn Arg
340 345 350

Ala Leu Gln Met Ala Lys Arg Asp Ala Leu Arg Ala Ser Gln Ala Arg
355 360 365

Asn Ala Phe Gln Lys Thr Met Ser Glu Gly Met Arg Arg Pro Met His
370 375 380

Ser Ile Leu Gly Leu Leu Ser Met Ile Gln Asp Glu Lys Leu Ser Asp
385 390 395 400

Glu Gln Lys Met Ile Val Asp Thr Met Val Lys Thr Gly Asn Val Met
405 410 415

Ser Asn Leu Val Gly Asp Ser Met Asp Val Pro Asp Gly Arg Phe Gly
420 425 430

Thr Glu Met Lys Pro Phe Ser Leu His Arg Thr Ile His Glu Ala Ala
435 440 445
Cys Met Ala Arg Cys Leu Cys Leu Cys Asn Gly Ile Arg Phe Leu Val
450 455 460


Asp Ala Glu Lys Ser Leu Pro Asp Asn Val Val Gly Asp Glu Arg Arg
465 470 475 480

Val Phe Gln Val Ile Leu His Met Val Gly Ser Leu Val Lys Pro Arg
485 490 495

Lys Arg Gln Glu Gly Ser Ser Leu Met Phe Lys Val Leu Lys Glu Arg
500 505 510

Gly Ser Leu Asp Arg Ser Asp His Arg Trp Ala Ala Trp Arg Ser Pro
515 520 525

Ala Ser Ser Ala Asp Gly Asp Val Tyr Ile Arg Phe Glu Met Asn Val
530 535 540

Glu Asn Asp Asp Ser Ser Ser Gln Ser Phe Ala Ser Val Ser Ser Arg
545 550 555 560

Asp Gln Glu Val Gly Asp Val Arg Phe Ser Gly Gly Tyr Gly Leu Gly
565 570 575

Gln Asp Leu Ser Phe Gly Val Cys Lys Lys Val Val Gln Leu Ile His
580 585 590

Gly Asn Ile Ser Val Val Pro Gly Ser Asp Gly Ser Pro Glu Thr Met
595 600 605

Ser Leu Leu Leu Arg Phe Arg Arg Arg Pro Ser Ile Ser Val His Gly
610 615 620

Ser Ser Glu Ser Pro Ala Pro Asp His His Ala His Pro His Ser Asn
625 630 635 640

Ser Leu Leu Arg Gly Leu Gln Val Leu Leu Val Asp Thr Asn Asp Ser
645 650 655

Asn Arg Ala Val Thr Arg Lys Leu Leu Glu Lys Leu Gly Cys Asp Val
660 665 670

Thr Ala Val Ser Ser Gly Phe Asp Cys Leu Thr Ala Ile Ala Pro Gly
675 680 685

Ser Ser Ser Pro Ser Thr Ser Phe Gln Val Val Val Leu Asp Leu Gln
690 695 700

Met Ala Glu Met Asp Gly Tyr Glu Val Ala Met Arg Ile Arg Ser Arg
705 710 715 720

Ser Trp Pro Leu Ile Val Ala Thr Thr Val Ser Leu Asp Glu Glu Met
725 730 735

Trp Asp Lys Cys Ala Gln Ile Gly Ile Asn Gly Val Val Arg Lys Pro
740 745 750

Val Val Leu Arg Ala Met Glu Ser Glu Leu Arg Arg Val Leu Leu Gln
755 760 765

Ala Asp Gln Leu Leu
770
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2404 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2322 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

ATG GTT AAA GAA ATA GCT TCT TGG TTA TTG ATA CTA TCA ATG GTG GTG 48 
Met Val Lys Glu Ile Ala Ser Trp Leu Leu Ile Leu Ser Met Val Val
1 5 10 15

TTT GTT TCT CCG GTT TTA GCT ATA AAC GGC GGT GGT TAT CCA CGA TGT 96 
Phe Val Ser Pro Val Leu Ala Ile Asn Gly Gly Gly Tyr Pro Arg Cys
20 25 30 

AAC TGC GAA GAC GAA GGA AAC AGT TTC TGG AGT ACA GAG AAC ATT CTA 144 
Asn Cys Glu Asp Glu Gly Asn Ser Phe Trp Ser Thr Glu Asn Ile Leu
35 40 45 

GAA ACT CAA AGA GTA AGC GAT TTC TTA ATC GCA GTA GCT TAT TTC TCA 192
Glu Thr Gln Arg Val Ser Asp Phe Leu Ile Ala Val Ala Tyr Phe Ser
50 55 60 

ATC CCT ATT GAG TTA CTT TAC TTC GTG AGT TGT TCC AAT GTT CCA TTC 240 
Ile Pro Ile Glu Leu Leu Tyr Phe Val Ser Cys Ser Asn Val Pro Phe
65 70 75 80

AAA TGG GTT CTC TTT GAG TTT ATC GCC TTC ATT GTT CTT TGT GGT ATG 288 
Lys Trp Val Leu Phe Glu Phe Ile Ala Phe Ile Val Leu Cys Gly Met
85 90 95 

ACT CAT CTT CTT CAT GGT TGG ACT TAC TCT GCT CAT CCA TTT AGA TTA 336 
Thr His Leu Leu His Gly Trp Thr Tyr Ser Ala His Pro Phe Arg Leu 
100 105 110 

ATG ATG GCG TTT ACT GTT TTC AAG ATG TTG ACT GCT TTA GTC TCT TGT 384 
Met Met Ala Phe Thr Val Phe Lys Met Leu Thr Ala Leu Val Ser Cys
115 120 125 

GCT ACT GCG ATT ACG CTT ATP ACT TTG ATT CCT CTG CTT TTC AAA GTT 432 
Ala Thr Ala Ile Thr Leu Ile Thr Leu Ile Pro Leu Leu Leu Lys Val
130 135 140 

AAA GTT AGA GAG TTT ATG CTT AAG AAG AAA GCT CAT GAG CTT GGT CGT 480 
Lys Val Arg Glu Phe Met Leu Lys Lys Lys Ala His Glu Leu Gly Arg 
145 150 155 160 

GAA GTT GGT TTG ATT TTG ATT AAG AAA GAG ACT GGC TTT CAT GTT CGT 528 
Glu Val Gly Leu Ile Leu Ile Lys Lys Glu Thr Gly Phe His Val Arg 
165 170 175 

ATG CTT ACT CAA GAG ATT CGT AAG TCT TTG GAT CGT CAT ACG ATT CTT 576 
Met Leu Thr Gln Glu Ile Arg Lys Ser Leu Asp Arg His Thr Ile Leu 
180 185 190 
TAT ACT ACT TTG GTT GAG CTT TCG AAG ACT TTA GGG TTG CAG AAT TGT 624 
Tyr Thr Thr Leu Val Glu Leu Ser Lys Thr Leu Gly Leu Gln Asn Cys 
195 200 205 

GCG GTT TGG ATG CCG AAT GAC GGT GGA ACG GAG ATG GAT TTG ACT CAT 672 
Ala Val Trp Met Pro Asn Asp Gly Gly Thr Glu Met Asp Leu Thr His 
210 215 220 

GAG TTG AGA GGG AGA GGT GGT TAT GGT GGT TGT TCT GTT TCT ATG GAG 720 
Glu Leu Arg Gly Arg Gly Gly Tyr Gly Gly Cys Ser Val Ser Met Glu 
225 230 235 240 

GAT TTG GAT GTT GTT AGG ATT AGG GAG AGT GAT GAA GTG AAT GTC TTG 768 
Asp Leu Asp Val Val Arg Ile Arg Glu Ser Asp Glu Val Asn Val Leu 
245 250 255 

Acrr GTT GAC TCG TCC ATT GCT CGA GCT AGT GGT GGT GGT GGG GAT GTT 816 
Ser Val Asp Ser Ser Ile Ala Arg Ala Ser Gly Gly Gly Gly Asp Val 
260 265 270 

AGT GAG ATT GGT GCC GTG GCT GCT ATT AGA ATG CCG ATG CTT CGT GTT 864 
Ser Glu Ile Gly Ala Val Ala Ala Ile Arg Met Pro Met Leu Arg Val 
275 280 285 

TCG GAT TTT AAT GGA GAG CTA AGT TAT GCG ATA CTT GTT TGT GTT TTA 912 
Ser Asp Phe Asn Gly Glu Leu Ser Tyr Ala Ile Leu Val Cys Val Leu 
290 295 300 

CCG GGC GGG ACC CGT CGG GAT TGG ACT TAT CAG GAG ATT GAG ATT GTT 960 
Pro Gly Gly Thr Arg Arg Asp Trp Thr Tyr Gln Glu Ile Glu Ile Val 
305 310 315 320 

AAA GTT GTG GCG GAT CAA GTA ACC GTT GCG TTA GAT CAT GCA GCG GTT 1008 
Lys Val Val Ala Asp Gln Val Thr Val Ala Leu Asp His Ala Ala Val 
325 330 335 

CTT GAA GAG TCT CAG CTT ATG AGG GAG AAG CTG GCG GAA CAG AAC AGG 1056 
Leu Glu Glu Ser Gln Leu Met Arg Glu Lys Leu Ala Glu Gln Asn Arg 
340 345 350 

GCG TTG CAG ATG GCG AAG AGA GAC GCG TTG AGA GCG AGC CAA GCG AGG 1104 
Ala Leu Gln Met Ala Lys Arg Asp Ala Leu Arg Ala Ser Gln Ala Arg 
355 360 365 

AAT GCC TTT CAG AAA ACG ATG AGC GAA GGG ATG AGG CGT CCT ATG CAT 1152 
Asn Ala Phe Gln Lys Thr Met Ser Glu Gly Met Arg Arg Pro Met His 
370 375 380 

TCG ATA CTC GGT CTT TTG TCG ATG ATT CAG GAC GAG AAG TTG AGT GAC 1200 
Ser Ile Leu Gly Leu Leu Ser Met Ile Gln Asp Glu Lys Leu Ser Asp 
385 390 395 400 

GAG CAG AAA ATG ATT GTT GAT ACG ATG GTT AAA ACA GGG AAT GTT ATG 1248 
Glu Gln Lys Met Ile Val Asp Thr Met Val Lys Thr Gly Asn Val Met 
405 410 415 

TCG AAT TTG GTC GGG GAC TCT ATG GAT GTG CCT GAC GGT AGA TTT GGT 1296 
Ser Asn Leu Val Gly Asp Ser Met Asp Val Pro Asp Gly Arg Phe Gly 
420 425 430 

ACG GAG ATG AAA CCG TTT AGT CTC CAT CGT ACG ATC CAT GAA GCA GCT 1344 
Thr Glu Met Lys Pro Phe Ser Leu His Arg Thr Ile His Glu Ala Ala 
435 440 445 
TGT ATG GCG AGA TGT TTG TGT CTA TGC AAT GGA ATT AGG TTC TTG GTT 1392 
Cys Met Ala Arg Cys Leu Cys Leu Cys Asn Gly Ile Arg Phe Leu Val 
450 455 460 

GAC GCG GAG AAG TCT CTA CCT GAT AAT GTA GTA GGT GAT GAA AGA AGG 1440 
Asp Ala Glu Lys Ser Leu Pro Asp Asn Val Val Gly Asp Glu Arg Arg 
465 470 475 480 

GTC TTT CAA GTG ATA CTT CAT ATG GTT GGT AGT TTA GTA AAG CCT AGA 1488 
Val Phe Gln Val Ile Leu His Met Val Gly Ser Leu Val Lys Pro Arg 
485 490 495 

AAA CGT CAA GAA GGA TCT TCA TTG ATG TTT AAG GTT TTG AAA GAA AGA 1536 
Lys Arg Gln Glu Gly Ser Ser Leu Met Phe Lys Val Leu Lys Glu Arg 
500 505 510 

GGA AGC TTG GAT AGG AGT GAT CAT AGA TGG GCT GCT TGG AGA TCA CCG 1584 
Gly Ser Leu Asp Arg Ser Asp His Arg Trp Ala Ala Trp Arg Ser Pro 
515 520 525 

GCT TCT TCA GCA GAT GGA GAT GTG TAT ATA AGA TTT GAA ATG AAT GTA 1632 
Ala Ser Ser Ala Asp Gly Asp Val Tyr Ile Arg Phe Glu Met Asn Val 
530 535 540 

GAG AAT GAT GAT TCA AGT TCT CAA TCA TTT GCT TCT GTT TCC TCC AGA 1680 
Glu Asn Asp Asp Ser Ser Ser Gln Ser Phe Ala Ser Val Ser Ser Arg 
545 550 555 560 

GAT CAA GAA GTT GGT GAT GTT AGA TTC TCC GGC GGC TAT GGG TTA GGA 1728 
Asp Gln Glu Val Gly Asp Val Arg Phe Ser Gly Gly Tyr Gly Leu Gly 
565 570 575 

CAA GAT CTA AGC TTT GGT GTT TGT AAG AAA GTG GTG CAG TTG ATT CAT 1776 
Gln Asp Leu Ser Phe Gly Val Cys Lys Lys Val Val Gln Leu Ile His 
580 585 590 

GGG AAT ATC TCG GTC GTC CCT GGC TCG GAT GGT TCA CCG GAG ACC ATG 1824 
Gly Asn Ile Ser Val Val Pro Gly Ser Asp Gly Ser Pro Glu Thr Met 
595 600 605 

TCG TTG CTC CTT CGG TTT CGA CGT AGA CCC TCC ATA TCT GTC CAT GGA 1872 
Ser Leu Leu Leu Arg Phe Arg Arg Arg Pro Ser Ile Ser Val His Gly 
610 615 620 

TCC AGC GAG TCG CCA GCT CCT GAC CAC CAC GCT CAC CCA CAT TCG AAT 1920 
Ser Ser Glu Ser Pro Ala Pro Asp His His Ala His Pro His Ser Asn 
625 630 635 640 

TCT CTC TTA CGT GGC TTA CAA GTT TTA TTG GTA GAC ACC AAC GAT TCG 1968 
Ser Leu Leu Arg Gly Leu Gln Val Leu Leu Val Asp Thr Asn Asp Ser 
645 650 655 

AAC CGG GCA GTT ACA CGT AAA CTC TTA GAG AAA CTC GGG TGC GAT GTA 2016 
Asn Arg Ala Val Thr Arg Lys Leu Leu Glu Lys Leu Gly Cys Asp Val 
660 665 670 

ACC GCG GTT TCC TCT GGA TTC GAT TGC CTT ACC GCC ATT GCT CCC GGC 2064 
Thr Ala Val Ser Ser Gly Phe Asp Cys Leu Thr Ala Ile Ala Pro Gly 
675 680 685 

TCG TCC TCG CCT TCT ACT TCG TTT CAA GTC GTC GTC CTT GAT CTT CAA 2112 
Ser Ser Ser Pro Ser Thr Ser Phe Gln Val Val Val Leu Asp Leu Gln 
690 695 700 
ATG GCA GAG ATG GAC GGT TAT GAA GTG GCC ATG AGG ATC AGG AGT CGA 2160 
Met Ala Glu Met Asp Gly Tyr Glu Val Ala Met Arg Ile Arg Ser Arg 
705 710 715 720 

TCT TGG CCG TTG ATT GTG GCG ACG ACA GTG AGC TTG GAT GAA GAA ATG 2208 
Ser Trp Pro Leu Ile Val Ala Thr Thr Val Ser Leu Asp Glu Glu Met 
725 730 735 

TGG GAC AAG TGT GCA CAG ATT GGA ATC AAT GGA GTT GTG AGA AAG CCA 2256 
Trp Asp Lys Cys Ala Gln Ile Gly Ile Asn Gly Val Val Arg Lys Pro 
740 745 750 

GTG GTG TTA AGA GCT ATG GAG AGT GAG CTC CGA AGA GTA TTG TTG CAA 2304 
Val Val Leu Arg Ala Met Glu Ser Glu Leu Arg Arg Val Leu Leu Gln 
755 760 765 

GCT GAC CAA CTT CTC TAAGTTGTTA TCTCAACTTC TCTTCTACAT TCAAAATTTT 2359 
Ala Asp Gln Leu Leu 
770 

TACACCATAG ATTTATGTCA AATATATCAA AATGAAATTT CGAAA 2404 
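
The paired rows of SEQ ID NO:43 above can be cross-checked mechanically, since each codon row should translate to the residue row printed beneath it. The following is a minimal Python sketch of such a check, assuming the standard genetic code applies; the names `CODON_TABLE`, `THREE_LETTER` and `translate_row` are illustrative only and do not appear in the original listing.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue codes as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_row(row):
    """Translate one space-separated codon row into three-letter residues.

    Tokens that are not exactly three of A, C, G, T (for example the
    running position number at the end of a row) are skipped.
    """
    codons = [t for t in row.split() if len(t) == 3 and set(t) <= set("ACGT")]
    return [THREE_LETTER[CODON_TABLE[c]] for c in codons]


# First codon row of SEQ ID NO:43, including its trailing position number.
first_row = "ATG GTT AAA GAA ATA GCT TCT TGG TTA TTG ATA CTA TCA ATG GTG GTG 48"
print(" ".join(translate_row(first_row)))
```

A row whose translation disagrees with the printed residues points to a transcription error in either the codon row or the residue row; the sketch does not decide which of the two is wrong.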



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 773 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: 

Met Val Lys Glu Ile Ala Ser Trp Leu Leu Ile Leu Ser Met Val Val 
1 5 10 15 

Phe Val Ser Pro Val Leu Ala Ile Asn Gly Gly Gly Tyr Pro Arg Cys 
20 25 30 

Asn Cys Glu Asp Glu Gly Asn Ser Phe Trp Ser Thr Glu Asn Ile Leu 
35 40 45 

Glu Thr Gln Arg Val Ser Asp Phe Leu Ile Ala Val Ala Tyr Phe Ser 
50 55 60 

Ile Pro Ile Glu Leu Leu Tyr Phe Val Ser Cys Ser Asn Val Pro Phe 
65 70 75 80 

Lys Trp Val Leu Phe Glu Phe Ile Ala Phe Ile Val Leu Cys Gly Met 
85 90 95 

Thr His Leu Leu His Gly Trp Thr Tyr Ser Ala His Pro Phe Arg Leu 
100 105 110 

Met Met Ala Phe Thr Val Phe Lys Met Leu Thr Ala Leu Val Ser Cys 
115 120 125 

Ala Thr Ala Ile Thr Leu Ile Thr Leu Ile Pro Leu Leu Leu Lys Val 
130 135 140 

Lys Val Arg Glu Phe Met Leu Lys Lys Lys Ala His Glu Leu Gly Arg 
145 150 155 160 
Glu Val Gly Leu Ile Leu Ile Lys Lys Glu Thr Gly Phe His Val Arg 
165 170 175 

Met Leu Thr Gln Glu Ile Arg Lys Ser Leu Asp Arg His Thr Ile Leu 
180 185 190 

Tyr Thr Thr Leu Val Glu Leu Ser Lys Thr Leu Gly Leu Gln Asn Cys 
195 200 205 

Ala Val Trp Met Pro Asn Asp Gly Gly Thr Glu Met Asp Leu Thr His 
210 215 220 

Glu Leu Arg Gly Arg Gly Gly Tyr Gly Gly Cys Ser Val Ser Met Glu 
225 230 235 240 

Asp Leu Asp Val Val Arg Ile Arg Glu Ser Asp Glu Val Asn Val Leu 
245 250 255 

Ser Val Asp Ser Ser Ile Ala Arg Ala Ser Gly Gly Gly Gly Asp Val 
260 265 270 

Ser Glu He Gly Ala Val Ala Ala He Arg Met Pro Met Leu Arg Val 
275 280 285 

Ser Asp Phe Asn Gly Glu Leu Ser Tyr Ala He Leu Val Cys Val Leu 
290 295 300 

Pro Gly Gly Thr Arg Arg Asp Trp Thr Tyr Gin Glu He Glu He Val 
305 310 315 320 

Lys Val Val Ala Asp Gin Val Thr Val Ala Leu Asp His Ala Ala Val 
325 330 335 

Leu Glu Glu Ser Gin Leu Met Arg Glu Lys Leu Ala Glu Gin Asn Arg 
340 345 350 

Ala Leu Gin Met Ala Lys Arg Asp Ala Leu Arg Ala Ser Gin Ala Arg 
355 360 365 

Asn Ala Phe Gin Lys Thr Met Ser Glu Gly Met Arg Arg Pro Met His 
370 375 380 

Ser He Leu Gly Leu Leu Ser Met He Gin Asp Glu Lys Leu Ser Asp 
385 390 395 400 

Glu Gin Lys Met He Val Asp Thr Met Val Lys Thr Gly Asn Val Met 
405 410 415 

Ser Asn Leu Val Gly Asp Ser Met Asp Val Pro Asp Gly Arg Phe Gly 
420 425 430 

Thr Glu Met Lys Pro Phe Ser Leu His Arg Thr He His Glu Ala Ala 
435 440 445 

Cys Met Ala Arg Cys Leu Cys Leu Cys Asn Gly He Arg Phe Leu Val 
450 455 460 

Asp Ala Glu Lys Ser Leu Pro Asp Asn Val Val Gly Asp Glu Arg Arg 
465 470 475 480 

Val Phe Gln Val Ile Leu His Met Val Gly Ser Leu Val Lys Pro Arg 
485 490 495 
Lys Arg Gln Glu Gly Ser Ser Leu Met Phe Lys Val Leu Lys Glu Arg 
500 505 510 

Gly Ser Leu Asp Arg Ser Asp His Arg Trp Ala Ala Trp Arg Ser Pro 
515 520 525 

Ala Ser Ser Ala Asp Gly Asp Val Tyr Ile Arg Phe Glu Met Asn Val 
530 535 540 

Glu Asn Asp Asp Ser Ser Ser Gln Ser Phe Ala Ser Val Ser Ser Arg 
545 550 555 560 

Asp Gln Glu Val Gly Asp Val Arg Phe Ser Gly Gly Tyr Gly Leu Gly 
565 570 575 

Gin Asp Leu Ser Phe Gly Val Cys Lys Lys Val Val Gin Leu He His 
580 585 590 

Gly Asn He Ser Val Val Pro Gly Ser Asp Gly Ser Pro Glu Thr Met 
595 600 605 

Ser Leu Leu Leu Arg Phe Arg Arg Arg Pro Ser Ile Ser Val His Gly 
610 615 620 

Ser Ser Glu Ser Pro Ala Pro Asp His His Ala His Pro His Ser Asn 
625 630 635 640 

Ser Leu Leu Arg Gly Leu Gin Val Leu Leu Val Asp Thr Asn Asp Ser 
645 650 655 

Asn Arg Ala Val Thr Arg Lys Leu Leu Glu Lys Leu Gly Cys Asp Val 
660 665 670 

Thr Ala Val Ser Ser Gly Phe Asp Cys Leu Thr Ala He Ala Pro Gly 
675 680 685 

Ser Ser Ser Pro Ser Thr Ser Phe Gin Val Val Val Leu Asp Leu Gin 
690 695 700 

Met Ala Glu Met Asp Gly Tyr Glu Val Ala Met Arg He Arg Ser Arg 
705 710 715 720 

Ser Trp Pro Leu He Val Ala Thr Thr Val Ser Leu Asp Glu Glu Met 
725 730 735 

Trp Asp Lys Cys Ala Gln Ile Gly Ile Asn Gly Val Val Arg Lys Pro 
740 745 750 

Val Val Leu Arg Ala Met Glu Ser Glu Leu Arg Arg Val Leu Leu Gin 
755 760 765 

Ala Asp Gin Leu Leu 
770 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3009 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(564..1469, 1565..1933, 2014..2280, 2359..2486, 2577..2748) 
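
The CDS location above can be sanity-checked by arithmetic: reading it as five exon segments (564..1469, 1565..1933, 2014..2280, 2359..2486, 2577..2748), the spliced length should be a multiple of three and should match the 613-residue protein of SEQ ID NO:46 once the terminal stop codon is counted. The Python sketch below illustrates this, assuming 1-based inclusive coordinates and that the final segment includes the stop codon; `parse_join`, `spliced_length` and `protein_length` are hypothetical helper names.

```python
import re


def parse_join(location):
    """Extract (start, end) pairs from a 'join(a..b, c..d, ...)' location string."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\s*\.\.\s*(\d+)", location)]


def spliced_length(segments):
    """Total nucleotides covered by 1-based inclusive segments."""
    return sum(end - start + 1 for start, end in segments)


def protein_length(segments):
    """Residue count, assuming the segments end with a stop codon."""
    total = spliced_length(segments)
    if total % 3 != 0:
        raise ValueError("spliced length %d is not a multiple of 3" % total)
    return total // 3 - 1  # subtract the stop codon


location = "join(564..1469, 1565..1933, 2014..2280, 2359..2486, 2577..2748)"
exons = parse_join(location)
print(spliced_length(exons), protein_length(exons))
```

The same helpers applied to an unspliced range such as 1..2322 (SEQ ID NO:43) give 773 residues, in agreement with the length stated for SEQ ID NO:44.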

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

ACTTTTAAAA TTTCTTTATT TCATTGTCAG AAAAAGAGAG CTAATAATAT TATTATTTAA 60 

ATGTAACAAG TAGGCCTATA ACACGTGAAC TTCCCTCTTT GCAAAAAAAA AATCATCAAA 120 

AACTTTTACC TCTCATTGGT TTCTTCTTTA TCACACTGTT ACGCTTGGAT TCTCATTTCT 180 

TCAAGTTCAT AACGCTCGGA TCAATCAGGA AGACGAACTT GAACTTTCTT TTTTTCATCA 240 

TTACCCAAAG CTATGAGGCT CACACCACCA ATACGTCCGC CGTCATGAAT CCTTCTCTTC 300 

CAGGTACTGT GCCGTCTCGG GATAACAAAC TTTCTATTTA TTCTCTTCTG ATCGGATCTA 360 

TCTATCGATG AAGATTGATT TCACTACTTT AGTAACATTT CATCTGATCG ATCTGTGTTG 420 

TGTTATCGAG GAATCAATCT CATTTTGTAG ATTCAATTTT CTGGATAGAT TTTCTATCTC 480 

TTTTCCATAG CTCTAGTCCA AATCTAGTCT CCACTGATAT CTGAGTTTTC TTGACCAGGT 540 

CAACACAAGT CAGAGCTCCA AAA ATG GAG TCA TGC GAT TGT TTT GAG ACG 590 

Met Glu Ser Cys Asp Cys Phe Glu Thr 
1 5 

CAT GTG AAT CAA GAT GAT CTG TTA GTG AAG TAC CAA TAC ATC TCA GAT 638 
His Val Asn Gln Asp Asp Leu Leu Val Lys Tyr Gln Tyr Ile Ser Asp 
10 15 20 25 

GCG TTG ATT GCT CTT GCA TAC TTC TCA ATC CCA CTC GAG CTT ATC TAT 686 
Ala Leu Ile Ala Leu Ala Tyr Phe Ser Ile Pro Leu Glu Leu Ile Tyr 
30 35 40 

TTC GTG CAA AAG TCT GCT TTC TTC CCT TAC AAA TGG GTG CTT ATG CAG 734 
Phe Val Gln Lys Ser Ala Phe Phe Pro Tyr Lys Trp Val Leu Met Gln 
45 50 55 

TTT GGA GCC TTT ATC ATT CTC TGT GGA GCT ACG CAT TTC ATC AAC CTA 782 
Phe Gly Ala Phe Ile Ile Leu Cys Gly Ala Thr His Phe Ile Asn Leu 
60 65 70 

TGG ATG TTC TTC ATG CAT TCC AAA GCC GTT GCC ATT GTC ATG ACT ATT 830 
Trp Met Phe Phe Met His Ser Lys Ala Val Ala Ile Val Met Thr Ile 
75 80 85 

GCT AAA GTC TCT TGC GCG GTT GTG TCG TGT GCT ACC GCG TTG ATG TTG 878 
Ala Lys Val Ser Cys Ala Val Val Ser Cys Ala Thr Ala Leu Met Leu 
90 95 100 105 

GTT CAT ATT ATT CCT GAT CTT CTC AGT GTT AAG AAC AGG GAA TTG TTT 926 
Val His Ile Ile Pro Asp Leu Leu Ser Val Lys Asn Arg Glu Leu Phe 
110 115 120 

CTC AAG AAG AAA GCT GAT GAG TTA GAT AGA GAA ATG GGT CTT ATT TTA 974 
Leu Lys Lys Lys Ala Asp Glu Leu Asp Arg Glu Met Gly Leu Ile Leu 
125 130 135 

ACA CAA GAG GAG ACT GGT AGG CAT GTT AGG ATG CTT ACT CAT GGA ATT 1022 
Thr Gln Glu Glu Thr Gly Arg His Val Arg Met Leu Thr His Gly Ile 
140 145 150 

AGA AGA ACT CTT GAT AGG CAT ACT ATT TTA AGA ACC ACT CTT GTT GAG 1070 
Arg Arg Thr Leu Asp Arg His Thr Ile Leu Arg Thr Thr Leu Val Glu 
155 160 165 

CTT GGT AAA ACT CTT TGT CTT GAG GAA TGT GCG TTG TGG ATG CCT TCT 1118 
Leu Gly Lys Thr Leu Cys Leu Glu Glu Cys Ala Leu Trp Met Pro Ser 
170 175 180 185 

CAA AGT GGT TTA TAT TTG CAG CTT TCT CAT ACT TTG AGT CAT AAA ATA 1166 
Gln Ser Gly Leu Tyr Leu Gln Leu Ser His Thr Leu Ser His Lys Ile 
190 195 200 

CAA GTT GGA AGC AGT GTG CCG ATA AAT CTC CCG ATT ATT AAT GAA CTC 1214 
Gln Val Gly Ser Ser Val Pro Ile Asn Leu Pro Ile Ile Asn Glu Leu 
205 210 215 

TTC AAT AGC GCT CAA GCT ATG CAC ATA CCT CAT TCT TGT CCT TTG GCT 1262 
Phe Asn Ser Ala Gln Ala Met His Ile Pro His Ser Cys Pro Leu Ala 
220 225 230 

AAG ATT GGG CCT CCG GTT GGG AGA TAT TCA CCT CCT GAG GTT GTT TCT 1310 
Lys He Gly Pro Pro Val Gly Arg Tyr Ser Pro Pro Glu Val Val Ser 
235 240 245 

GTC CGT GTT CCT CTT TTA CAT CTC TCT AAT TTC CAA GGC AGT GAC TGG 1358 
Val Arg Val Pro Leu Leu His Leu Ser Asn Phe Gin Gly Ser Asp Trp 
250 255 260 265 

TCG GAT CTC TCT GGC AAA GGT TAC GCT ATC ATG GTC CTG ATT CTC CCA 1406 
Ser Asp Leu Ser Gly Lys Gly Tyr Ala He Met Val Leu He Leu Pro 
270 275 280 

ACC GAT GGT GCA AGA AAA TGG AGA GAC CAT GAG TTA GAG CTT GTA GAA 1454 
Thr Asp Gly Ala Arg Lys Trp Arg Asp His Glu Leu Glu Leu Val Glu 
285 290 295 

AAC GTG GCG GAT CAG GTCCATCTCT TTACTTGTAT ATGTTTGGTT GTGTGTCAAG 1509 
Asn Val Ala Asp Gin 
300 

TTGCTTTACC AGCTTTTAGT GTTTTGTTTT GTCCCCTGAC TCTCACTTCA TTCAG GTG 1567 

Val 



GCT GTG GCT CTC TCA CAT GCT GCA ATT TTG GAA GAA TCC ATG CAC GCT 1615 
Ala Val Ala Leu Ser His Ala Ala He Leu Glu Glu Ser Met His Ala 
305 310 315 

CGT GAC CAG CTT ATG GAG CAG AAT TTT GCT TTA GAC AAG GCT CGT CAA 1663 
Arg Asp Gin Leu Met Glu Gin Asn Phe Ala Leu Asp Lys Ala Arg Gin 
320 325 330 335 

GAG GCT GAG ATG GCA GTA CAT GCT CGA AAT GAT TTC CTA GCT GTT ATG 1711 
Glu Ala Glu Met Ala Val His Ala Arg Asn Asp Phe Leu Ala Val Met 
340 345 350 

AAC CAC GAG ATG AGG ACA CCG ATG CAT GCC ATC ATC TCT CTT TCT TCT 1759 
Asn His Glu Met Arg Thr Pro Met His Ala Ile Ile Ser Leu Ser Ser 
355 360 365 
CTT CTC CTT GAG ACT GAG CTG TCT CCA GAG CAA AGA GTT ATG ATC GAG 1807 
Leu Leu Leu Glu Thr Glu Leu Ser Pro Glu Gln Arg Val Met Ile Glu 
370 375 380 

ACA ATA CTG AAA AGC AGC AAT CTT GTG GCT ACA CTA ATC AGC GAC GTT 1855 
Thr Ile Leu Lys Ser Ser Asn Leu Val Ala Thr Leu Ile Ser Asp Val 
385 390 395 

CTG GAT CTT TCG AGA TTG GAA GAT GGG AGC TTA CTC TTG GAA AAT GAA 1903 
Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Leu Leu Glu Asn Glu 
400 405 410 415 

CCA TTC AGT CTA CAA GCG ATC TTT GAA GAG GTAACTAAAT CCCCCTGATT 1953 
Pro Phe Ser Leu Gln Ala Ile Phe Glu Glu 
420 425 

AACCAGTGAA GTCCATTATA TATGTCTTAC ATGAATAACA TGGGCGCTTT GAATCTGCAG 2013 

GTC ATC TCT TTG ATA AAG CCA ATC GCA TCA GTG AAG AAA CTA TCA ACG 2061 
Val Ile Ser Leu Ile Lys Pro Ile Ala Ser Val Lys Lys Leu Ser Thr 
430 435 440 

AAT CTG ATT CTG TCT GCA GAC TTA CCA ACT TAT GCT ATT GGT GAT GAG 2109 
Asn Leu Ile Leu Ser Ala Asp Leu Pro Thr Tyr Ala Ile Gly Asp Glu 
445 450 455 

AAA CGT CTG ATG CAA ACA ATT CTT AAC ATC ATG GGC AAC GCT GTG AAA 2157 
Lys Arg Leu Met Gln Thr Ile Leu Asn Ile Met Gly Asn Ala Val Lys 
460 465 470 

TTT ACT AAG GAA GGC TAC ATC TCC ATA ATA GCC TCT ATC ATG AAA CCC 2205 
Phe Thr Lys Glu Gly Tyr Ile Ser Ile Ile Ala Ser Ile Met Lys Pro 
475 480 485 

GAG TCC TTA CAA GAA TTA CCA TCT CCA GAA TTT TTT CCA GTT CTC AGT 2253 
Glu Ser Leu Gln Glu Leu Pro Ser Pro Glu Phe Phe Pro Val Leu Ser 
490 495 500 505 

GAC AGT CAC TTC TAC CTA TGT GTG CAG GTTAGACCCA ATCTACAAAT 2300 
Asp Ser His Phe Tyr Leu Cys Val Gln 
510 

TACTAAACTA CAAAGTTAAG CTTCTTACTG TGTTCTTACT GTTATAATCA TGGTGCAG 2358 

GTG AAG GAC ACA GGG TGT GGA ATT CAC ACA CAA GAC ATT CCT TTG CTC 2406 
Val Lys Asp Thr Gly Cys Gly Ile His Thr Gln Asp Ile Pro Leu Leu 
520 525 530 

TTT ACC AAA TTT GTA CAG CCT CGC ACC GGA ACT CAG AGG AAC CAT TCC 2454 
Phe Thr Lys Phe Val Gln Pro Arg Thr Gly Thr Gln Arg Asn His Ser 
535 540 545 

GGT GGA GGA CTC GGG CTA GCT CTC TGT AAA CG GTAACAACCC 2496 
Gly Gly Gly Leu Gly Leu Ala Leu Cys Lys Arg 
550 555 

AAAAGTATAT ATAAGTTATA AGCAGATGGT GTTACAAATA GCTAAAAGGC AAGnTCTGT 2556 

TGATGGATGT CTCTGGTTAG G TTT GTC GGG CTA ATG GGA GGA TAC ATG TGG 2607 

Phe Val Gly Leu Met Gly Gly Tyr Met Trp 
560 565 
ATA GAA AGT GAA GGC CTA GAG AAA GGC TGC ACA GCT TCG TTC ATC ATC 2655 
Ile Glu Ser Glu Gly Leu Glu Lys Gly Cys Thr Ala Ser Phe Ile Ile 
570 575 580 

AGG CTT GGT ATC TGC AAC GGT CCA AGC AGT AGC AGT GGT TCA ATG GCG 2703 
Arg Leu Gly Ile Cys Asn Gly Pro Ser Ser Ser Ser Gly Ser Met Ala 
585 590 595 

CTA CAT CTT GCA GCT AAA TCA CAA ACC AGA CCG TGG AAC TGG TGATACTTAC 2755 
Leu His Leu Ala Ala Lys Ser Gln Thr Arg Pro Trp Asn Trp 
600 605 610 

GTTGGAAAGA CTTGTATTCA GGTCAGACTT TTTAACTACA CAGCAGCAAG AGAAAGAAGA 2815 
AAATACATGA CCGGACGGTC TCATCTAACT TATTGGATTT TGTTGGATGT AATATCTAAA 2875 
ATAAAAATCC TATATACGGG CAGAGGTACC TTATCTGTTC TCACTATATT TTATTGAACA 2935 
TTACTTTAGA GAATATGTTT TCGAATTCAC TACTAAATAA ACGATATAAA TCTTCACGAA 2995 

AAGAGCAACA TTTT 3009 
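
Each nucleotide row in the listing ends with a running position number, so a row can be checked by counting its bases and comparing the cumulative total with the printed figure. The Python sketch below shows one way to do this, assuming rows are supplied as (text, printed total) pairs; `count_bases` and `check_running_totals` are illustrative names, not part of the listing.

```python
def count_bases(row):
    """Count A, C, G and T letters in a row, ignoring spaces, digits and other marks."""
    return sum(ch in "ACGT" for ch in row.upper())


def check_running_totals(rows, start=0):
    """Return the indices of rows whose cumulative base count differs
    from the position number printed at the end of the row.

    rows is a list of (row_text, printed_total) pairs. After a mismatch
    the counter is resynchronised to the printed total so that later
    rows are judged independently.
    """
    total = start
    mismatches = []
    for index, (text, printed) in enumerate(rows):
        total += count_bases(text)
        if total != printed:
            mismatches.append(index)
            total = printed
    return mismatches


# First two rows of SEQ ID NO:45 with their printed totals.
rows = [
    ("ACTTTTAAAA TTTCTTTATT TCATTGTCAG AAAAAGAGAG CTAATAATAT TATTATTTAA", 60),
    ("ATGTAACAAG TAGGCCTATA ACACGTGAAC TTCCCTCTTT GCAAAAAAAA AATCATCAAA", 120),
]
print(check_running_totals(rows))
```

A mismatch only indicates that bases were dropped or added in that row, or that the printed number itself is wrong; it does not identify which letters are affected.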



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Met Glu Ser Cys Asp Cys Phe Glu Thr His Val Asn Gin Asp Asp Leu 
1 5 10 15 

Leu Val Lys Tyr Gln Tyr Ile Ser Asp Ala Leu Ile Ala Leu Ala Tyr 
20 25 30 

Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Gln Lys Ser Ala Phe 
35 40 45 

Phe Pro Tyr Lys Trp Val Leu Met Gln Phe Gly Ala Phe Ile Ile Leu 
50 55 60 

Cys Gly Ala Thr His Phe Ile Asn Leu Trp Met Phe Phe Met His Ser 
65 70 75 80 

Lys Ala Val Ala Ile Val Met Thr Ile Ala Lys Val Ser Cys Ala Val 
85 90 95 

Val Ser Cys Ala Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu 
100 105 110 

Leu Ser Val Lys Asn Arg Glu Leu Phe Leu Lys Lys Lys Ala Asp Glu 
115 120 125 

Leu Asp Arg Glu Met Gly Leu He Leu Thr Gin Glu Glu Thr Gly Arg 
130 135 140 

His Val Arg Met Leu Thr His Gly He Arg Arg Thr Leu Asp Arg His 
145 150 155 160 
Thr Ile Leu Arg Thr Thr Leu Val Glu Leu Gly Lys Thr Leu Cys Leu 
165 170 175 

Glu Glu Cys Ala Leu Trp Met Pro Ser Gln Ser Gly Leu Tyr Leu Gln 
180 185 190 

Leu Ser His Thr Leu Ser His Lys Ile Gln Val Gly Ser Ser Val Pro 
195 200 205 

Ile Asn Leu Pro Ile Ile Asn Glu Leu Phe Asn Ser Ala Gln Ala Met 
210 215 220 

His Ile Pro His Ser Cys Pro Leu Ala Lys Ile Gly Pro Pro Val Gly 
225 230 235 240 

Arg Tyr Ser Pro Pro Glu Val Val Ser Val Arg Val Pro Leu Leu His 
245 250 255 

Leu Ser Asn Phe Gin Gly Ser Asp Trp Ser Asp Leu Ser Gly Lys Gly 
260 265 270 

Tyr Ala Ile Met Val Leu Ile Leu Pro Thr Asp Gly Ala Arg Lys Trp 
275 280 285 

Arg Asp His Glu Leu Glu Leu Val Glu Asn Val Ala Asp Gln Val Ala 
290 295 300 

Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met His Ala Arg 
305 310 315 320 

Asp Gln Leu Met Glu Gln Asn Phe Ala Leu Asp Lys Ala Arg Gln Glu 
325 330 335 

Ala Glu Met Ala Val His Ala Arg Asn Asp Phe Leu Ala Val Met Asn 
340 345 350 

His Glu Met Arg Thr Pro Met His Ala Ile Ile Ser Leu Ser Ser Leu 
355 360 365 

Leu Leu Glu Thr Glu Leu Ser Pro Glu Gln Arg Val Met Ile Glu Thr 
370 375 380 

Ile Leu Lys Ser Ser Asn Leu Val Ala Thr Leu Ile Ser Asp Val Leu 
385 390 395 400 

Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Leu Leu Glu Asn Glu Pro 
405 410 415 

Phe Ser Leu Gln Ala Ile Phe Glu Glu Val Ile Ser Leu Ile Lys Pro 
420 425 430 

Ile Ala Ser Val Lys Lys Leu Ser Thr Asn Leu Ile Leu Ser Ala Asp 
435 440 445 

Leu Pro Thr Tyr Ala Ile Gly Asp Glu Lys Arg Leu Met Gln Thr Ile 
450 455 460 

Leu Asn Ile Met Gly Asn Ala Val Lys Phe Thr Lys Glu Gly Tyr Ile 
465 470 475 480 

Ser Ile Ile Ala Ser Ile Met Lys Pro Glu Ser Leu Gln Glu Leu Pro 
485 490 495 

Ser Pro Glu Phe Phe Pro Val Leu Ser Asp Ser His Phe Tyr Leu Cys 
500 505 510 

Val Gln Val Lys Asp Thr Gly Cys Gly Ile His Thr Gln Asp Ile Pro 
515 520 525 

Leu Leu Phe Thr Lys Phe Val Gln Pro Arg Thr Gly Thr Gln Arg Asn 
530 535 540 

His Ser Gly Gly Gly Leu Gly Leu Ala Leu Cys Lys Arg Phe Val Gly 
545 550 555 560 

Leu Met Gly Gly Tyr Met Trp Ile Glu Ser Glu Gly Leu Glu Lys Gly 
565 570 575 

Cys Thr Ala Ser Phe Ile Ile Arg Leu Gly Ile Cys Asn Gly Pro Ser 
580 585 590 

Ser Ser Ser Gly Ser Met Ala Leu His Leu Ala Ala Lys Ser Gln Thr 
595 600 605 

Arg Pro Trp Asn Trp 
610 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2314 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 224. .2065 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
AAAAAAATCA TCAAAAACTT TTACCTCTCA TTGGTTTCTT CTTTATCACA CTGTTACGCT 60 
TGGATTCTCA TTTCTTCAAG TTCATAACGC TCGGATCAAT CAGGAAGACG AACTTGAACT 120 

TTCTTTTTTT CATCATTACC CAAAGCTATG AGGCTCACAC CACCAATACG TCCGCCGTCA 180 

TGAATCCTTC TCTTCCAGGT CAACACAAGT CAGAGCTCCA AAA ATG GAG TCA TGC 235 
Met Glu Ser Cys 
1 

GAT TGT TTT GAG ACG CAT GTC AAT CAA GAT GAT CTG TTA GTG AAG TAC 283 
Asp Cys Phe Glu Thr His Val Asn Gln Asp Asp Leu Leu Val Lys Tyr 
5 10 15 20 

CAA TAC ATC TCA GAT GCG TTG ATT GCT CTT GCA TAC TTC TCA ATC CCA 331 
Gln Tyr Ile Ser Asp Ala Leu Ile Ala Leu Ala Tyr Phe Ser Ile Pro 
25 30 35 

CTC GAG CTT ATC TAT TTC GTG CAA AAG TCT GCT TTC TTC CCT TAC AAA 379 
Leu Glu Leu Ile Tyr Phe Val Gln Lys Ser Ala Phe Phe Pro Tyr Lys 
40 45 50 
TGG GTG CTT ATG CAG TTT GGA GCC TTT ATC ATT CTC TGT GGA GCT ACG 427 
Trp Val Leu Met Gln Phe Gly Ala Phe Ile Ile Leu Cys Gly Ala Thr 
55 60 65 

CAT TTC ATC AAC CTA TGG ATG TTC TTC ATG CAT TCC AAA GCC GTT GCC 475 
His Phe Ile Asn Leu Trp Met Phe Phe Met His Ser Lys Ala Val Ala 
70 75 80 

ATT GTC ATG ACT ATT GCT AAA GTC TCT TGC GCG GTT GTG TCG TGT GCT 523 
Ile Val Met Thr Ile Ala Lys Val Ser Cys Ala Val Val Ser Cys Ala 
85 90 95 100 

ACC GCG TTG ATG TTG GTT CAT ATT ATT CCT GAT CTT CTC AGT GTT AAG 571 
Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val Lys 
105 110 115 

AAC AGG GAA TTG TTT CTC AAG AAG AAA GCT GAT GAG TTA GAT AGA GAA 619 
Asn Arg Glu Leu Phe Leu Lys Lys Lys Ala Asp Glu Leu Asp Arg Glu 
120 125 130 

ATG GGT CTT ATT TTA ACA CAA GAG GAG ACT GGT AGG CAT GTT AGG ATG 667 
Met Gly Leu Ile Leu Thr Gln Glu Glu Thr Gly Arg His Val Arg Met 
135 140 145 

CTT ACT CAT GGA ATT AGA AGA ACT CTT GAT AGG CAT ACT ATT TTA AGA 715 
Leu Thr His Gly Ile Arg Arg Thr Leu Asp Arg His Thr Ile Leu Arg 
150 155 160 

ACC ACT CTT GTT GAG CTT GGT AAA ACT CTT TGT CTT GAG GAA TGT GCG 763 
Thr Thr Leu Val Glu Leu Gly Lys Thr Leu Cys Leu Glu Glu Cys Ala 
165 170 175 180 

TTG TGG ATG CCT TCT CAA AGT GGT TTA TAT TTG CAG CTT TCT CAT ACT 811 
Leu Trp Met Pro Ser Gln Ser Gly Leu Tyr Leu Gln Leu Ser His Thr 
185 190 195 

TTG AGT CAT AAA ATA CAA GTT GGA AGC AGT GTC CCG ATA AAT CTC CCG 859 
Leu Ser His Lys Ile Gln Val Gly Ser Ser Val Pro Ile Asn Leu Pro 
200 205 210 

ATT ATT AAT GAA CTC TTC AAT AGC GCT CAA GCT ATG CAC ATA CCT CAT 907 
Ile Ile Asn Glu Leu Phe Asn Ser Ala Gln Ala Met His Ile Pro His 
215 220 225 

TCT TGT CCT TTG GCT AAG ATT GGG CCT CCG GTT GGG AGA TAT TCA CCT 955 
Ser Cys Pro Leu Ala Lys Ile Gly Pro Pro Val Gly Arg Tyr Ser Pro 
230 235 240 

CCT GAG GTT GTT TCT GTC CGT GTT CCT CTT TTA CAT CTC TCT AAT TTC 1003 
Pro Glu Val Val Ser Val Arg Val Pro Leu Leu His Leu Ser Asn Phe 
245 250 255 260 

CAA GGC AGT GAC TGG TCG GAT CTC TCT GGC AAA GGT TAC GCT ATC ATG 1051 
Gln Gly Ser Asp Trp Ser Asp Leu Ser Gly Lys Gly Tyr Ala Ile Met 
265 270 275 

GTC CTC ATT CTC CCA ACC GAT GGT GCA AGA AAA TGG AGA GAC CAT GAG 1099 
Val Leu Ile Leu Pro Thr Asp Gly Ala Arg Lys Trp Arg Asp His Glu 
280 285 290 

TTA GAG CTT GTA GAA AAC GTC GCG GAT CAG GTC GCT GTC GCT CTC TCA 1147 
Leu Glu Leu Val Glu Asn Val Ala Asp Gln Val Ala Val Ala Leu Ser 
295 300 305 
CAT GCT GCA ATT TTG GAA GAA TCC ATG CAC GCT CGT GAC CAG CTT ATG 1195 
His Ala Ala Ile Leu Glu Glu Ser Met His Ala Arg Asp Gln Leu Met 

310 315 320 

GAG CAG AAT TTT GCT TTA GAC AAG GCT CGT CAA GAG GCT GAG ATG GCA 1243 
Glu Gln Asn Phe Ala Leu Asp Lys Ala Arg Gln Glu Ala Glu Met Ala 

325 330 335 340 

GTA CAT GCT CGA AAT GAT TTC CTA GCT GTT ATG AAC CAC GAG ATG AGG 1291 
Val His Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu Met Arg 
345 350 355 

ACA CCG ATG CAT GCC ATC ATC TCT CTT TCT TCT CTT CTC CTT GAG ACT 1339 
Thr Pro Met His Ala Ile Ile Ser Leu Ser Ser Leu Leu Leu Glu Thr 
360 365 370 

GAG CTG TCT CCA GAG CAA AGA GTT ATG ATC GAG ACA ATA CTG AAA AGC 1387 
Glu Leu Ser Pro Glu Gln Arg Val Met Ile Glu Thr Ile Leu Lys Ser 
375 380 385 

AGC AAT CTT GTC GCT ACA CTA ATC AGC GAC GTT CTG GAT CTT TCG AGA 1435 
Ser Asn Leu Val Ala Thr Leu Ile Ser Asp Val Leu Asp Leu Ser Arg 
390 395 400 

TTG GAA GAT GGG AGC TTA CTC TTG GAA AAT GAA CCA TTC AGT CTA CAA 1483 
Leu Glu Asp Gly Ser Leu Leu Leu Glu Asn Glu Pro Phe Ser Leu Gln 
405 410 415 420 

GCG ATC TTT GAA GAG GTC ATC TCT TTG ATA AAG CCA ATC GCA TCA GTG 1531 
Ala Ile Phe Glu Glu Val Ile Ser Leu Ile Lys Pro Ile Ala Ser Val 
425 430 435 

AAG AAA CTA TCA ACG AAT CTG ATT CTG TCT GCA GAC TTA CCA ACT TAT 1579 
Lys Lys Leu Ser Thr Asn Leu Ile Leu Ser Ala Asp Leu Pro Thr Tyr 
440 445 450 

GCT ATT GGT GAT GAG AAA CGT CTG ATG CAA ACA ATT CTT AAC ATC ATG 1627 
Ala Ile Gly Asp Glu Lys Arg Leu Met Gln Thr Ile Leu Asn Ile Met 
455 460 465 

GGC AAC GCT GTG AAA TTT ACT AAG GAA GGC TAC ATC TCC ATA ATA GCC 1675 
Gly Asn Ala Val Lys Phe Thr Lys Glu Gly Tyr Ile Ser Ile Ile Ala 
470 475 480 

TCT ATC ATG AAA CCC GAG TCC TTA CAA GAA TTA CCA TCT CCA GAA TTT 1723 
Ser Ile Met Lys Pro Glu Ser Leu Gln Glu Leu Pro Ser Pro Glu Phe 
485 490 495 500 

TTT CCA GTT CTC AGT GAC AGT CAC TTC TAC CTA TGT GTG CAG GTG AAG 1771 
Phe Pro Val Leu Ser Asp Ser His Phe Tyr Leu Cys Val Gln Val Lys 
505 510 515 

GAC ACA GGG TGT GGA ATT CAC ACA CAA GAC ATT CCT TTG CTC TTT ACC 1819 
Asp Thr Gly Cys Gly Ile His Thr Gln Asp Ile Pro Leu Leu Phe Thr 
520 525 530 

AAA TTT GTA CAG CCT CGC ACC GGA ACT CAG AGG AAC CAT TCC GGT GGA 1867 
Lys Phe Val Gln Pro Arg Thr Gly Thr Gln Arg Asn His Ser Gly Gly 
535 540 545 

GGA CTC GGG CTA GCT CTC TGT AAA CGG TTT GTC GGG CTA ATG GGA GGA 1915 
Gly Leu Gly Leu Ala Leu Cys Lys Arg Phe Val Gly Leu Met Gly Gly 
550 555 560 
TAC ATG TGG ATA GAA AGT GAA GGC CTA GAG AAA GGC TGC ACA GCT TCG 1963 
Tyr Met Trp Ile Glu Ser Glu Gly Leu Glu Lys Gly Cys Thr Ala Ser 
565 570 575 580 

TTC ATC ATC AGG CTT GGT ATC TGC AAC GGT CCA AGC AGT AGC AGT GGT 2011 
Phe Ile Ile Arg Leu Gly Ile Cys Asn Gly Pro Ser Ser Ser Ser Gly 
585 590 595 

TCA ATG GCG CTA CAT CTT GCA GCT AAA TCA CAA ACC AGA CCG TGG AAC 2059 
Ser Met Ala Leu His Leu Ala Ala Lys Ser Gln Thr Arg Pro Trp Asn 
600 605 610 

TGG TGATACTTAC GTTGGAAAGA CTTGTATTGA GGTGAGACTT TTTAACTACA 2112 
Trp 

CAGCAGCAAG AGAAAGAAGA AAATACATGA CCGGACGGTG TGATCTAACT TATTGGATTT 2172 
TGTTGGATGT AATATGTAAA ATAAAAATCC TATATACGGG GAGAGGTACC TTATCTGTTC 2232 
TCACTATATT TTATTGAACA TTACTTTAGA GAATATGTTT TGGAATTCAC TACTAAATAA 2292 
ACGATATAAA TCTTCACGAA AA 2314 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Met Glu Ser Cys Asp Cys Phe Glu Thr His Val Asn Gin Asp Asp Leu 
1 5 10 15 

Leu Val Lys Tyr Gln Tyr Ile Ser Asp Ala Leu Ile Ala Leu Ala Tyr 
20 25 30 

Phe Ser He Pro Leu Glu Leu He Tyr Phe Val Gin Lys Ser Ala Phe 
35 40 45 

Phe Pro Tyr Lys Trp Val Leu Met Gln Phe Gly Ala Phe Ile Ile Leu 
50 55 60 

Cys Gly Ala Thr His Phe He Asn Leu Trp Met Phe Phe Met His Ser 
65 70 75 80 

Lys Ala Val Ala He Val Met Thr He Ala Lys Val Ser Cys Ala Val 
85 90 95 

Val Ser Cys Ala Thr Ala Leu Met Leu Val His He He Pro Asp Leu 
100 105 110 

Leu Ser Val Lys Asn Arg Glu Leu Phe Leu Lys Lys Lys Ala Asp Glu 
115 120 125 

Leu Asp Arg Glu Met Gly Leu Ile Leu Thr Gln Glu Glu Thr Gly Arg 
130 135 140 

His Val Arg Met Leu Thr His Gly He Arg Arg Thr Leu Asp Arg His 
145 150 155 160 

Thr He Leu Arg Thr Thr Leu Val Glu Leu Gly Lys Thr Leu Cys Leu 
165 170 175 

Glu Glu Cys Ala Leu Trp Met Pro Ser Gin Ser Gly Leu Tyr Leu Gin 
180 185 190 

Leu Ser His Thr Leu Ser His Lys He Gin Val Gly Ser Ser Val Pro 
195 200 205 

He Asn Leu Pro He He Asn Glu Leu Phe Asn Ser Ala Gin Ala Met 
210 215 220 

His He Pro His Ser Cys Pro Leu Ala Lys He Gly Pro Pro Val Gly 
225 230 235 240 

Arg Tyr Ser Pro Pro Glu Val Val Ser Val Arg Val Pro Leu Leu His 
245 250 255 

Leu Ser Asn Phe Gin Gly Ser Asp Trp Ser Asp Leu Ser Gly Lys Gly 
260 265 270 

Tyr Ala He Met Val Leu He Leu Pro Thr Asp Gly Ala Arg Lys Trp 
275 280 285 

Arg Asp His Glu Leu Glu Leu Val Glu Asn Val Ala Asp Gin Val Ala 
290 295 300 

Val Ala Leu Ser His Ala Ala He Leu Glu Glu Ser Met His Ala Arg 
305 310 315 320 

Asp Gin Leu Met Glu Gin Asn Phe Ala Leu Asp Lys Ala Arg Gin Glu 
325 330 335 

Ala Glu Met Ala Val His Ala Arg Asn Asp Phe Leu Ala Val Met Asn 
340 345 350 

His Glu Met Arg Thr Pro Met His Ala He He Ser Leu Ser Ser Leu 
355 360 365 

Leu Leu Glu Thr Glu Leu Ser Pro Glu Gin Arg Val Met He Glu Thr 
370 375 380 

He Leu Lys Ser Ser Asn Leu Val Ala Thr Leu He Ser Asp Val Leu 
385 390 395 400 

Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Leu Leu Glu Asn Glu Pro 
405 410 415 

Phe Ser Leu Gin Ala He Phe Glu Glu Val He Ser Leu He Lys Pro 
420 425 430 

He Ala Ser Val Lys Lys Leu Ser Thr Asn Leu He Leu Ser Ala Asp 
435 440 445 

Leu Pro Thr Tyr Ala He Gly Asp Glu Lys Arg Leu Met Gin Thr He 
450 455 460 

Leu Asn Ile Met Gly Asn Ala Val Lys Phe Thr Lys Glu Gly Tyr Ile 
465 470 475 480 
Ser Ile Ile Ala Ser Ile Met Lys Pro Glu Ser Leu Gln Glu Leu Pro 
485 490 495 

Ser Pro Glu Phe Phe Pro Val Leu Ser Asp Ser His Phe Tyr Leu Cys 
500 505 510 

Val Gin Val Lys Asp Thr Gly Cys Gly lie His Thr Gin Asp lie Pro 
515 520 525 

Leu Leu Phe Thr Lys Phe Val Gin Pro Arg Thr Gly Thr Gin Arg Asn 
530 535 540 

His Ser Gly Gly Gly Leu Gly Leu Ala Leu Cys Lys Arg Phe Val Gly 
545 550 555 560 

Leu Met Gly Gly Tyr Met Trp lie Glu Ser Glu Gly Leu Glu Lys Gly 
565 570 575 

Cys Thr Ala Ser Phe lie lie Arg Leu Gly lie Cys Asn Gly Pro Ser 
580 585 590 

Ser Ser Ser Gly Ser Met Ala Leu His Leu Ala Ala Lys Ser Gin Thr 
595 600 605 

Arg Pro Trp Asn Trp 
610 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2405 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 288.. 2196 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

TTTTTTTTTT GTCAAAAGCT CGATGTAAAA ATCCGATGGC CACAAGCAAA ACGACAGGTT 60 

CCAACTTCAC GGAGATTGTG AAAATGGACT AGTAGTTCAG TGAAGTAGTA GATACTGAGA 120 

TCGCATTCTC CGGCGTCGTT TTTCACATCG AAATAGTCGT GTAAAAAAAT GAAAAAATTC 180 

CTGCGAGACA GGTATGTGTC CCAGCAGGAA ATAGCATCTT AAAGGAAGGA ACGAACGAAA 240 

CTCGAAAGTT ACTAAAAATT TTTGATTCTT TGGGACGAAA CGAGATA ATG GAA TCC 296 

Met Glu Ser 
1 

TGT GAT TGC ATT GAG GCT TTA CTG CCA ACT GGT GAC CTG CTG GTT AAA 344 
Cys Asp Cys Ile Glu Ala Leu Leu Pro Thr Gly Asp Leu Leu Val Lys 
5 10 15 

TAC CAA TAC CTC TCA GAT TTC TTC ATT GCT GTA GCC TAC TTT TCC ATT 392 
Tyr Gln Tyr Leu Ser Asp Phe Phe Ile Ala Val Ala Tyr Phe Ser Ile 
20 25 30 35 
CTG TTG GAG CTT ATT TAT TTT GTC CAC AAA TCT GCA TGC TTC CCA TAC 440 
Leu Leu Glu Leu Ile Tyr Phe Val His Lys Ser Ala Cys Phe Pro Tyr 
40 45 50 

AGA TGG GTC CTC ATG CAA TTT GGT GCT TTT ATT GTG CTC TGT GGA GCA 488 
Arg Trp Val Leu Met Gln Phe Gly Ala Phe Ile Val Leu Cys Gly Ala 
55 60 65 

ACA CAC TTT ATT AGC TTG TGG ACC TTC TTT ATG CAC TCT AAG ACG GTC 536 
Thr His Phe Ile Ser Leu Trp Thr Phe Phe Met His Ser Lys Thr Val 
70 75 80 

GCT GTG GTT ATG ACC ATA TCA AAA ATG TTG ACA GCT GCC GTG TCC TGT 584 
Ala Val Val Met Thr Ile Ser Lys Met Leu Thr Ala Ala Val Ser Cys 
85 90 95 

ATC ACA GCT TTG ATG CTT GTT CAC ATT ATT CCT GAT TTG CTA AGT GTT 632 
Ile Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val 
100 105 110 115 

AAA ACG CGA GAG TTG TTC TTG AAA ACT CGA GCT GAA GAG CTT GAC AAG 680 
Lys Thr Arg Glu Leu Phe Leu Lys Thr Arg Ala Glu Glu Leu Asp Lys 
120 125 130 

GAA ATG GGC CTA ATA ATA AGA CAA GAA GAA ACT GGC AGA CAT GTC AGG 728 
Glu Met Gly Leu Ile Ile Arg Gln Glu Glu Thr Gly Arg His Val Arg 
135 140 145 

ATG CTG ACT CAT GAG ATA AGA AGC ACA CTC GAC AGA CAC ACA ATC TTG 776 
Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His Thr Ile Leu 
150 155 160 

AAG ACT ACT CTT GTG GAG CTA GGT AGG ACC TTA GAC CTG GCA GAA TGT 824 
Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Asp Leu Ala Glu Cys 
165 170 175 

GCT TTG TGG ATG CCA TGC CAA GGA GGC CTG ACT TTG CAA CTT TCC CAT 872 
Ala Leu Trp Met Pro Cys Gln Gly Gly Leu Thr Leu Gln Leu Ser His 
180 185 190 195 

AAT TTA AAC AAT CTA ATA CCT CTG GGA TCT ACT GTG CCA ATT AAT CTT 920 
Asn Leu Asn Asn Leu Ile Pro Leu Gly Ser Thr Val Pro Ile Asn Leu 
200 205 210 

CCT ATT ATC AAT GAA ATT TTT AGT AGC CCT CAA GCA ATA CAA ATT CCA 968 
Pro Ile Ile Asn Glu Ile Phe Ser Ser Pro Glu Ala Ile Gln Ile Pro 
215 220 225 

CAT ACA AAT CCT TTG CCA AGG ATG AGG AAT ACT GTT GGT AGA TAT ATT 1016 
His Thr Asn Pro Leu Ala Arg Met Arg Asn Thr Val Gly Arg Tyr Ile 
230 235 240 

CCA CCA GAA GTA GTT GCT GTT CGT CTA CCG CTT TTA CAC CTC TCA AAT 1064 
Pro Pro Glu Val Val Ala Val Arg Val Pro Leu Leu His Leu Ser Asn 
245 250 255 

TTT ACT AAT GAC TGG GCT GAA CTG TCT ACT AGA AGT TAT CCG GTT ATG 1112 
Phe Thr Asn Asp Trp Ala Glu Leu Ser Thr Arg Ser Tyr Ala Val Met 
260 265 270 275 

GTT CTG GTT CTC CCG ATG AAT GGC TTA AGA AAG TGG CGT GAA CAT GAG 1160 
Val Leu Val Leu Pro Met Asn Gly Leu Arg Lys Trp Arg Glu His Glu 
280 285 290 

TTA GAA CTT GTG CAA GTT GTC GCA GAT CAG GTT GCT GTC GCT CTT TCA 1208 
Leu Glu Leu Val Gln Val Val Ala Asp Gln Val Ala Val Ala Leu Ser 
295 300 305 



Hi! f?I T^^ "^^^A GCC CAT GAT CAG CTC MG 

tlo "^'^ Ala His Asp Gin Leu T?e 

•'•^^ 315 320 

GAA CAG AAT ATT GCT TTG GAT GTA GCT CGA CAA GAA GCA GAG ATC crr 
Glu Gin Asn He Ala Leu Asp Val Ala Arg Gin Stu Sa Glu l?et Al 

330 335 



1256 



1304 



iTf III rrc CTT GCT GTG ATC AAC CAT GAA ATC AGA 

lie Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu Met ^g 

350 355 

ACG CCC ATG CAT GCA GTT ATT GCT CTG TCC TCT CTG CTT TTA GAA ACa 
Thr Pro Met His Ala Val lie Ala Leu Cys Ser Leu SuTeu Glu 4r 



1352 



1400 



1448. 



GAC TTA ACT CCA GAG CAG AGA GTT ATG ATT GAG ACC ATA ttt j^i^n 
Asp Leu Thr Pro Glu Gin Arg Val Met JTe G?u''5Sr''lleTeu^^^ 

380 385 

AGC AAT CTT CTT GCA ACA CTG ATA AAT GAT GTT CTA GAT CTT TCT ar-a 1 
ser Asn Leu Leu Ala Thr Leu lie Asn Asp Val Ltu°lIp'TIJ7er'''S:g 

395 

CTT GAA GAT GGT ATT CTT GAA CTA GAA AAC GGA ACA TTC AAT CTT CAT 1 Q^. 
Leu Glu ASP Gly He Leu Glu Leu Glu Asn Gly T^r'ISe^nTeuTis 

410 415 

430 435 

AAG AAA TTA TCT ATA ACT CTT GCT TTG GCT CTC GAT TTA CCT ATT Ctt i c-o 
Lys Lys Leu Ser lie Thr Leu Ala Leu Ala Leu Sp L^u "So 7leTeu 
440 445 



450 



1688 



GCT GTG GGT GAT GCA AAA CGT CTT ATC CAA ACT CTC TTA AAC GTC GTT 
Ala Val Gly Asp Ala Lys Arg Leu He Gin Thr^u L^uTsnTaf "^al 

460 

GGA AAT GCT GTG AAG TTC ACT AAA GAA GGA CAT ATT TCA ATT GAG rrr n:.^ 
Gly Asn Al* Val Lys Phe Thr Lys Glu Gly His He S^r "e Glu Ma ^ 

475 480 

TCA GTT GCC AAA CCA GAG TAT GCG AGA GAT TCT CAT CCT CCT GAA ATC no a 
Ser Val Ala Lys Pro Glu lyr Ala Arg Asp Cys hIs'^So Tro^^lu^^et 
^ 490 

S[f t*^ ^^'^ TTT TAT TTC CGT GTC CAG GTT ACA 1832 

Phe Pro Met Pro Ser Asp Gly Gin Phe Tyr Leu Arg Val Gln^rSrg " 

SIO 525 

fol *;CT GGG TCT GGA ATT AGC CCA CAA GAT ATA CCA CTA GTA TTC ACC 1880 
Asp Thr Gly cys Gly lie Ser Pro Gin Asp He Pro^u ValThe 4r 

iJi SI ^ ^f? ^ ACT GGA GGO GAA 1928 

Lys Phe Ala Glu Ser Arg Pro Thr Ser Asn Arg Ser Thr Gly Gly Glu 
535 540 545 
1976 



550 555 

580 585 

600 

GCT CAA CGC TAT CAA AGA AGT ATG TAA A TGACAAAAGG ACATTGGTGT 2216 
Ala Gln Arg Tyr Gln Arg Ser Met 
630 635 
GACAAAGAAC ATTAAATCAT GACTAGTGAA TTTGAGATTT CTTCACTGTT CTGTACACTC 2276 
CAAATCGCAC AGTTTCTCTT GTAACTAACC TAATTCAATG CTCGTAAAGT GAGTACTGGA 2336 
GTATCTWAA AATGTAACTA TCGAATTTAT ACATCGAGCT TTTGACAAAA AAAAAAAAAA 2396 
AAAAAAAAA 2405 
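
As a cross-check on the listing above, the residue line printed under each codon line can be regenerated from the nucleotides with the standard genetic code. The following Python sketch is illustrative only and is not part of the original specification; the names BASES, AMINO, CODON_TABLE, THREE_LETTER and translate are assumptions introduced here. It translates the first nineteen codons of the coding sequence, beginning with the ATG at position 288.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names; "***" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter residues."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)]


# Codons 1 to 19 as printed in the listing (positions 288 to 344).
first_codons = (
    "ATG GAA TCC "
    "TGT GAT TGC ATT GAG GCT TTA CTG CCA ACT GGT GAC CTG CTG GTT AAA"
)
print(" ".join(translate(first_codons)))
# Met Glu Ser Cys Asp Cys Ile Glu Ala Leu Leu Pro Thr Gly Asp Leu Leu Val Lys
```

Running the same function over a later codon line is a quick way to spot scanning errors in the residue line, such as "lie" or "He" printed for Ile and "Gin" for Gln.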

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 636 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Met Glu Ser Cys Asp Cys Ile Glu Ala Leu Leu Pro Thr Gly Asp Leu 
1 5 10 15 

Leu Val Lys Tyr Gln Tyr Leu Ser Asp Phe Phe Ile Ala Val Ala Tyr 
20 25 30 

Phe Ser Ile Leu Leu Glu Leu Ile Tyr Phe Val His Lys Ser Ala Cys 
35 40 45 

Phe Pro Tyr Arg Trp Val Leu Met Gln Phe Gly Ala Phe Ile Val Leu 
50 55 60 

Cys Gly Ala Thr His Phe Ile Ser Leu Trp Thr Phe Phe Met His Ser 
65 70 75 80 

Lys Thr Val Ala Val Val Met Thr Ile Ser Lys Met Leu Thr Ala Ala 
85 90 95 

Val Ser Cys Ile Thr Ala Leu Met Leu Val His Ile Ile Pro Asp Leu 
100 105 110 

Leu Ser Val Lys Thr Arg Glu Leu Phe Leu Lys Thr Arg Ala Glu Glu 
115 120 125 

Leu Asp Lys Glu Met Gly Leu Ile Ile Arg Gln Glu Glu Thr Gly Arg 
130 135 140 

His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg His 
145 150 155 160 

Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Asp Leu 
165 170 175 

Ala Glu Cys Ala Leu Trp Met Pro Cys Gln Gly Gly Leu Thr Leu Gln 
180 185 190 

Leu Ser His Asn Leu Asn Asn Leu Ile Pro Leu Gly Ser Thr Val Pro 
195 200 205 

Ile Asn Leu Pro Ile Ile Asn Glu Ile Phe Ser Ser Pro Glu Ala Ile 
210 215 220 

Gln Ile Pro His Thr Asn Pro Leu Ala Arg Met Arg Asn Thr Val Gly 
225 230 235 240 

Arg Tyr Ile Pro Pro Glu Val Val Ala Val Arg Val Pro Leu Leu His 
245 250 255 

Leu Ser Asn Phe Thr Asn Asp Trp Ala Glu Leu Ser Thr Arg Ser Tyr 
260 265 270 

Ala Val Met Val Leu Val Leu Pro Met Asn Gly Leu Arg Lys Trp Arg 
275 280 285 

Glu His Glu Leu Glu Leu Val Gln Val Val Ala Asp Gln Val Ala Val 
290 295 300 

Ala Leu Ser His Ala Ala Ile Leu Glu Asp Ser Met Arg Ala His Asp 
305 310 315 320 

Gln Leu Met Glu Gln Asn Ile Ala Leu Asp Val Ala Arg Gln Glu Ala 
325 330 335 

Glu Met Ala Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His 
340 345 350 

Glu Met Arg Thr Pro Met His Ala Val Ile Ala Leu Cys Ser Leu Leu 
355 360 365 

Leu Glu Thr Asp Leu Thr Pro Glu Gln Arg Val Met Ile Glu Thr Ile 
370 375 380 

Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Ile Asn Asp Val Leu Asp 
385 390 395 400 

Leu Ser Arg Leu Glu Asp Gly Ile Leu Glu Leu Glu Asn Gly Thr Phe 
405 410 415 

Asn Leu His Gly Ile Leu Arg Glu Ala Val Asn Leu Ile Lys Pro Ile 
420 425 430 

Ala Ser 

Pro lie 
450 

Asn Val 
465 

He Glu 

Pro Glu 

Gin Val 

Val Phe 
530 

Gly Gly 
545 

Met Lys 
Thr Val 
Leu Pro 



Leu Lys 
435 

Leu Ala 
Val Gly 
Ala Ser 



Met Phe 
500 

Arg Asp 
515 



Asp Asp 
610 

Ser Val 
625 



Thr Lys 

Glu Gly 

Gly Asn 

Thr Phe 
580 

Leu Leu 
595 

Leu Phe 
Asn Ala 



Lys Leu 

Val Gly 

Asn Ala 
470 

Val Ala 
485 

Pro Met 
Thr Gly 
Phe Ala 



Ser He 
440 

Asp Ala 
455 

Val Lys 
Lys Pro 
Pro Ser 

Thr Leu Ala Leu 



Ala Leu Asp Leu 
445 



cys Gly 
520 

Glu Ser 
535 



Leu Gly 
550 

He Trp 
565 

Val Val 
Pro Met 
Arg Tyr 



Gin Arg 
630 



Leu Ala 
He Glu 
Lys Leu 



Lys Arg 

Phe Thr 

Glu Tyr 
490 

Asp Gly 
505 

He Ser 
Arg Pro 
He Trp 



Leu He Gin Thr Leu Leu 
460 



Lys Glu 
475 

Ala Arg 
Gin Phe 
Pro Gin 



Gly His He Ser 
480 

Asp Cys His Pro 
495 

Tyr Leu Arg Val 
510 

Asp He Pro Leu 
525 



Thr Ser Asn Arg Ser Thr 
540 



Ser Glu 
570 

Gly He 
585 



Arg Gly 
Phe Arg 
Tyr Gin Arg Ser 



Pro Pro 
600 

Arg Gin 
615 



Arg Arg 
555 

Gly Pro 
Cys His 
Arg Leu 



Phe He Gin Leu 
560 

Gly Lys Gly Thr 
575 

His Pro Asn Ala 
590 

Asn Lys Gly Ser 
605 



Gly Asp Asp Gly Gly Met 
620 

Met * 
635 

WHAT IS CLAIMED IS: 

1. An isolated nucleic acid comprising a plant ETR 
nucleic acid. 

2. An isolated nucleic acid comprising a modified 
plant ETR nucleic acid containing the substitution, 
insertion or deletion of one or more nucleotides of a 
precursor ETR nucleic acid. 






3. The nucleic acid according to Claim 2 wherein said 
modified ETR nucleic acid encodes a modified ETR 
protein containing the substitution, insertion or 
deletion of one or more amino acid residues as compared 
to the precursor ETR protein encoded by said precursor 
ETR nucleic acid. 

4. A nucleic acid according to Claim 3 wherein said 
modified ETR protein comprises the substitution of at 
least one selected amino acid residue in said precursor 
ETR protein with a different amino acid and wherein 
said selected amino acid residue in said precursor ETR 
protein is equivalent to an amino acid residue selected 
from the group consisting of Ala-31, Ile-62, Cys-65 and 
Ala-102 in the ETR protein from Arabidopsis thaliana. 

5. A recombinant nucleic acid comprising a promoter 
operably linked to a modified plant ETR nucleic acid. 

6. A recombinant nucleic acid according to Claim 5 
wherein said modified ETR nucleic acid contains the 
substitution, insertion or deletion of one or more 
nucleotides of a precursor ETR nucleic acid and wherein 
said promoter is heterologous to said precursor ETR 
nucleic acid and capable of causing expression of said 
modified ETR nucleic acid in a plant cell. 

7. A recombinant nucleic acid according to Claim 6 
wherein said promoter comprises a tissue-specific or 
temporal-specific promoter. 

8. A recombinant nucleic acid according to Claim 6 
wherein said promoter is inducible. 

9. A plant cell transformed with the recombinant 
nucleic acid of Claim 6. 



10. A plant comprising the plant cell of Claim 9. 

11. A plant comprising at least one plant cell 
transformed with a modified ETR nucleic acid and having 
a phenotype characterized by a decrease in the response 
of said at least one transformed plant cell to ethylene 
as compared to a corresponding wild-type plant not 
containing said transformed plant cell. 

12. A plant according to Claim 11 wherein said 
modified ETR nucleic acid comprises the substitution, 
insertion or deletion of one or more nucleotides in a 
precursor ETR nucleic acid which results in the 
substitution, insertion or deletion of one or more 
amino acid residues in the modified ETR protein encoded 
by said modified ETR nucleic acid as compared to the 
precursor ETR protein encoded by said precursor ETR 
nucleic acid. 

13. A plant according to Claim 12 wherein the 
modification in said precursor ETR nucleic acid 
comprises the substitution of one or more nucleotides 
which results in the substitution of one or more 
selected amino acid residues in said precursor ETR 
protein with a different amino acid, said selected 
amino acid residue is equivalent to an amino acid 
residue selected from the group consisting of Ala-31, 
Ile-62, Cys-65 and Ala-102 in the ETR protein from 
Arabidopsis thaliana. 

14. A plant according to Claim 12 wherein a 
tissue-specific promoter is operably linked to said 
modified ETR nucleic acid. 

15. A plant according to Claim 14 wherein said plant 
is fruit-bearing and said promoter comprises a fruit- 
specific promoter. 






16. A plant according to Claim 15 wherein said 
phenotype is characterized by a decrease in the rate of 
fruit ripening. 

17. Fruit from the plant according to Claim 16. 

18. The fruit according to Claim 17 comprising tomato. 

19. A method for producing a plant having at least one 
transformed plant cell and a phenotype characterized by 
a decrease in the response of said at least one 
transformed plant cell to ethylene as compared to a 
plant not containing said transformed plant cell, said 
method comprising the steps of: 

a) transforming at least one plant cell with a 
modified ETR nucleic acid; 

b) regenerating plants from one or more of the 
thus transformed plant cells; and 

c) selecting at least one plant having said 
phenotype. 

20. A method according to Claim 19 wherein said 
modified ETR nucleic acid is operably linked to a 
tissue-specific promoter. 

AAAGATAGTA TTTGTTGATA AATATGGGGA TATTTATCCT ATATTATCTG 50 

TATTTTTCTT ACCATTTTTA CTCTATTCCT TTATCTACAT TACGTCATTA 100 

CACTATCATA AGATATTTGA ATGAACAAAT TCATGCACCC ACCAGCTATA 150 

TTACCCTTTT TTATTAAAAA AAAACATCTG ATAATAATAA CAAAAAAATT 200 

AGAGAAATGA CGTCGAAAAA AAAAGTAAGA ACGAAGAAGA AGTGTTAAAC 250 

CCAACCAATT TTGACTTGAA AAAAAGCTTC AACGCTCCCC TTTTCTCCTT 300 

CTCCGTCGCT CTCCGCCGCG TCCCAAATCC CCAATTCCTC CTCTTCTCCG 350 

ATCAATTCTT CCCAAGTAAG CTTCTTCTTC CTCGATTCTC TCCTCAGATT 400 

GTTTCGTGAC TTCTTTATAT ATATTCTTCA CTTCCACAGT TTTCTTCTGT 450 

TGTTGTCGTC GATCTCAAAT CATAGAGATT GATTAACCTA ATTGGTCTTT 500 

ATCTAGTGTA ATGCATCGTT ATTAGGAACT TTAAATTAAG ATTTAATCGT 550 

TAATTTCATG ATTCGGATTC GAATTTTACT GTTCTCGAGA CTGAAATATG 600 

CAACCTATTT TTTCGTAATC GTTGTGATCG AATTCGATTC TTCAGAATTT 650 

ATAGCAATTT TGATGCTCAT GATCTGTCTA CGCTACGTTC TCGTCGTAAA 700 

TCGAAGTTGA TAATGCTATG TGTTTGTTAC ACAGGTGTGT GTATGTGTGA 750 

GAGAGGAACT ATAGTGTAAA AAATTCATAA TGGAAGTCTG CAATTGTATT 800 

GAACCGCAAT GGCCAGCGGA TGAATTGTTA ATGAAATACC AATACATCTC 850 

CGATTTCTTC ATTGCGATTG CGTATTTTTC GATTCCTCTT GAGTTGATTT 900 

ACTTTGTGAA GAAATCAGCC GTGTTTCCGT ATAGATGGGT ACTTGTTCAG 950 

TTTGGTGCTT TTATCGTTCT TTGTGGAGCA ACTCATCTTA TTAACTTATG 1000 

GACTTTCACT ACGCATTCGA GAACCGTGGC GCTTGTGATG ACTACCGCGA 1050 

AGGTGTTAAC CGCTGTTGTC TCGTGTGCTA CTGCGTTGAT GCTTGTTCAT 1100 

ATTATTCCTG ATCTTTTGAG TGTTAAGACT CGGGAGCTTT TCTTGAAAAA 1150 

TAAAGCTGCT GAGCTCGATA GAGAAATGGG ATTGATTCGA ACTCAGGAAG 1200 

AAACCGGAAG GCATGTGAGA ATGTTGACTC ATGAGATTAG AAGCACTTTA 1250 

GATAGACATA CTATTTTAAA GACTACACTT GTTGAGCTTG GTAGGACATT 1300 

AGCTTTGGAG GAGTGTGCAT TGTGGATGCC TACTAGAACT GGGTTAGAGC 1350 

TACAGCTTTC TTATACACTT CGTCATCAAC ATCCCGTGGA GTATACGGTT 1400 

CCTATTCAAT TACCGGTGAT TAACCAAGTG TTTGGTACTA GTAGGGCTGT 1450 

AAAAATATCT CCTAATTCTC CTGTGGCTAG GTTGAGACCT GTTTCTGGGA 1500 

AATATATGCT AGGGGAGGTG GTCGCTGTGA GGGTTCCGCT TCTCCACCTT 1550 

TCTAATTTTC AGATTAATGA CTGGCCTGAG CTTTCAACAA AGAGATATGC 1600 

TTTGATGGTT TTGATGCTTC CTTCAGATAG TGCAAGGCAA TGGCATGTCC 1650 

ATGAGTTGGA ACTCGTTGAA GTCGTCGCTG ATCAGGTTTT ACATTGCTGA 1700 

GAATTTCTCT TCTTTGCTAT GTTCATGATC TTGTCTATAA CTTTTCTTCT 1750 

CTTATTATAG GTGGCTGTAG CTCTCTCACA TGCTGCGATC CTAGAAGAGT 1800 

CGATGCGAGC TAGGGACCTT CTCATGGAGC AGAATGTTGC TCTTGATCTA 1850 

GCTAGACGAG AAGCAGAAAC AGCAATCCGT GCCCGCAATG ATTTCCTAGC 1900 

GGTTATGAAC CATGAAATGC GAACACCGAT GCATGCGATT ATTGCACTCT 1950 

CTTCCTTACT CCAAGAAACG GAACTAACCC CTGAACAAAG ACTGATGGTG 2000 

GAAACAATAC TTAAAAGTAG TAACCTTTTG GCAACTTTGA TGAATGATGT 2050 

CTTAGATCTT TCAAGGTTAG AAGATGGAAG TCTTCAACTT GAACTTGGGA 2100 

CATTCAATCT TCATACATTA TTTAGAGAGG TAACTTTTGA ACAGCTCTAT 2150 

GTTTCATAAG TTTATACTAT TTGTGTACTT GATTGTCATA TTGAATCTTG 2200 

TTGCAGGTCC TCAATCTGAT AAAGCCTATA GCGGTTGTTA AGAAATTACC 2250 

CATCACACTA AATCTTGCAC CAGATTTGCC AGAATTTGTT GTTGGGGATG 2300 

AGAAACGGCT AATGCAGATA ATATTAAATA TAGTTGGTAA TGCTGTGAAA 2350 

TTCTCCAAAC AAGGTAGTAT CTCCGTAACC GCTCTTGTCA CCAAGTCAGA 2400 

CACACGAGCT GCTGACTTTT TTGTCGTGCC AACTGGGAGT CATTTCTACT 2450 
TGAGAGTGAA GGTTATTATC TTGTATCTTG GGATCTTATA CCATAGCTGA 2500 
AAGTATTTCT TAGGTCTTAA TTTTGATGAT TATTCAAATA TAGGTAAAAG 2550 
ACTCTGGAGC AGGAATAAAT CCTCAAGACA TTCCAAAGAT TTTCACTAAA 2600 
TTTGCTCAAA CACAATCTTT AGCGACGAGA AGCTCGGGTG GTAGTGGGCT 2650 
TGGCCTCGCC ATCTCCAAGA GGTTTGAGCC TTATTAAAAG ACGTTTTTTT 2700 
CCAACTTTTT CTTGTCTTCT GTGTTGTTAA AAGTTTACTC ATAAGCGTTT 2750 
AATATGACAA GGTTTGTGAA TCTGATGGAG GGTAACATTT GGATTGAGAG 2800 
CGATGGTCTT GGAAAAGGAT GCACGGCTAT CTTTGATGTT AAACTTGGGA 2850 
TCTCAGAACG TTCAAACGAA TCTAAACAGT CGGGCATACC GAAAGTTCCA 2900 
GCCATTCCCC GACATTCAAA TTTCACTGGA CTTAAGGTTC TTGTCATGGA 2950 
TGAGAACGGG TTAGTATAAG CTTCTCACCT TTCTCTTTGC AAAATCTCTC 3000 
GCCTTACTTC TTGCAAATGC AGATATTGGC GTTTAGAAAA AACGCAAATT 3050 
TAATCTTATG AGAAACCGAT GATTATTTTG GTTGCAGGGT AAGTAGAATG 3100 

FIG. 2B 

GTGACGAAGG GACTTCTTGT ACACCTTGGG TGCGAAGTGA CCACGGTGAG 
TTCAAACGAG GAGTGTCTCC GAGTTGTGTC CCATGAGCAC AAAGTGGTCT 
TCATGGACGT GTGCATGCCC GGGGTCGAAA ACTACCAAAT CGCTCTCCGT 
ATTCACGAGA AATTCACAAA ACAACGCCAC CAACGGCCAC TACTTGTGGC 
ACTCAGTGGT AACACTGACA AATCCACAAA AGAGAAATGC ATGAGCTTTG 
GTCTAGACGG TGTGTTGCTC AAACCCGTAT CACTAGACAA CATAAGAGAT 
GTTCTGTCTG ATCTTCTCGA GCCCCGGGTA CTGTACGAGG GCATGTAAAG 
GCGATGGATG CCCCATGCCC CAGAGGAGTA ATTCCGCTCC CGCCTTCTTC 
TCCCGTAAAA CATCGGAAGC TGATGTTCTC TGGTTTAATT GTGTACATAT 
CAGAGATTGT CGGAGCGTTT TGGATGATAT CTTAAAACAG AAAGGGAATA 
ACAAAATAGA AACTCTAAAC CGGTATGTGT CCGTGGCGAT TTCGGTTATA 
GAGGAACAAG ATGGTGGTGG TATAATCATA CCATTTCAGA TTACATGTTT 
GACTAATGTT GTATCCTTAT ATATGTAGTT ACATTCTTAT AAGAATTTGG 
ATCGAGTTAT GGATGCTTGT TGCGTGCATG TATGACATTG ATGCAGTATT 
ATGGCGTCAG CTTTGCGCCG CTTAGTAGAA CAACAACAAT GGCGTTACTT 
AGTTTCTCAA TCAACCCGAT CTCCAAAAC 
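
The genomic sequence above contains stretches that are absent from the cDNA sequence shown with its translation, which is the pattern expected for introns. The following Python sketch is illustrative only and is not part of the original specification. It takes genomic positions 1651 to 1800 as printed, removes an intron whose coordinates (1686 to 1760) are an assumption inferred here by comparison with the cDNA rather than a value stated in the source, and checks that the joined flanking sequence matches the corresponding cDNA stretch. The names splice, GENOMIC_1651_1800 and CDNA_FRAGMENT are hypothetical.

```python
# Genomic positions 1651-1800 as printed (spaces are layout only).
GENOMIC_1651_1800 = (
    "ATGAGTTGGA ACTCGTTGAA GTCGTCGCTG ATCAGGTTTT ACATTGCTGA"
    "GAATTTCTCT TCTTTGCTAT GTTCATGATC TTGTCTATAA CTTTTCTTCT"
    "CTTATTATAG GTGGCTGTAG CTCTCTCACA TGCTGCGATC CTAGAAGAGT"
).replace(" ", "")
FRAGMENT_START = 1651

# Assumed intron coordinates (1-based, inclusive), inferred by comparison
# with the cDNA; they are not stated in the source text.
INTRON_START, INTRON_END = 1686, 1760

# Matching cDNA stretch, copied from the codon lines ending at 1081, 1123
# and the start of the line ending at 1165.
CDNA_FRAGMENT = (
    "AT GAG TTG GAA CTC GTT GAA GTC "
    "GTC GCT GAT CAG GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC "
    "CTA GAA GAG T"
).replace(" ", "")


def splice(fragment, fragment_start, intron_start, intron_end):
    """Split a genomic fragment into (5' exon part, intron, 3' exon part)."""
    a = intron_start - fragment_start
    b = intron_end - fragment_start + 1
    return fragment[:a], fragment[a:b], fragment[b:]


exon_5, intron, exon_3 = splice(
    GENOMIC_1651_1800, FRAGMENT_START, INTRON_START, INTRON_END
)
print(len(intron), intron[:2], intron[-2:])
print(exon_5 + exon_3 == CDNA_FRAGMENT)
```

Under these assumptions the removed stretch begins with GT and ends with AG, consistent with the usual intron boundary dinucleotides, and the spliced genomic fragment is identical to the cDNA fragment.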

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA 50 
AAGCTTCAAC GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC 100 
CAAATCCCCA ATTCCTCCTC TTCTCCGATC AATTCTTCCC AAGTGTGTGT 150 
ATGTGTGAGA GAGGAACTAT AGTGTAAAAA ATTCATA ATG GAA GTC TGC 199 

1 

AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT GAA TTG TTA ATG 241 
Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu Leu Met 
5 10 15 

AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT GCG TAT 283 
Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Ala Tyr 
20 25 30 

TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325 
Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser 
35 40 45 

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT 367 
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala 
50 55 60 

TTT ATC GTT CTT TGT GGA GCA ACT CAT CTT ATT AAC TTA TGG 409 
Phe Ile Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp 
65 70 

ACT TTC ACT ACG CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT 451 
Thr Phe Thr Thr His Ser Arg Thr Val Ala Leu Val Met Thr 
75 80 85 

ACC GCG AAG GTG TTA ACC GCT GTT GTC TCG TGT GCT ACT GCG 493 
Thr Ala Lys Val Leu Thr Ala Val Val Ser Cys Ala Thr Ala 
90 95 100 

TTG ATG CTT GTT CAT ATT ATT CCT GAT CTT TTG AGT GTT AAG 535 
Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val Lys 
105 110 115 

ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT GCT GAG CTC GAT 577 
Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu Leu Asp 
120 125 130 

AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC GGA AGG 619 
Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg 
135 140 

CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661 
His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp 
145 150 155 

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG 703 
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg 
160 165 170 

ACA TTA GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA 745 
Thr Leu Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg 
175 180 185 

ACT GGG TTA GAG CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA 787 
Thr Gly Leu Glu Leu Gln Leu Ser Tyr Thr Leu Arg His Gln 
190 195 200 

CAT CCC GTG GAG TAT ACG GTT CCT ATT CAA TTA CCG GTG ATT 829 
His Pro Val Glu Tyr Thr Val Pro Ile Gln Leu Pro Val Ile 

205 210 

AAC CAA GTG TTT GGT ACT AGT AGG GCT GTA AAA ATA TCT CCT 871 
Asn Gin Val Phe Gly Thr Ser Arg Ala Val Lys lie Ser Pro 
215 220 225 

AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT TCT GGG AAA TAT 913 
Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly Lys Tyr 
230 235 240 

ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT CTC CAC 955 
Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His 
245 250 255 

CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997 
Leu Ser Asn Phe Gin He Asn Asp Trp Pro Glu Leu Ser Thr 
260 265 270 

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT 1039 
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser 

275 280 

GCA AGG CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC 1081 
Ala Arg Gin Trp His Val His Glu Leu Glu Leu Val Glu Val 
285 290 295 

GTC GCT GAT CAG GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC 1123 
Val Ala Asp Gin Val Ala Val Ala Leu Ser His Ala Ala He 
300 305 310 

CTA GAA GAG TCG ATG CGA GCT AGG GAC CTT CTC ATG GAG CAG 1165 
Leu Glu Glu Ser Met Arg Ala Arg Asp Leu Leu Met Glu Gin 
315 320 325 

AAT GTT GCT CTT GAT CTA GCT AGA CGA GAA GCA GAA ACA GCA 1207 
Asn Val Ala Leu Asp Leu Ala Arg Arg Glu Ala Glu Thr Ala 
330 335 340 

ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT ATG AAC CAT GAA 1249 
He Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu 

345 350 

ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT TCC TTA 1291 
Met Arg Thr Pro Met His Ala He He Ala Leu Ser Ser Leu 
355 360 365 

CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333 
Leu Gin Glu Thr Glu Leu Thr Pro Glu Gin Arg Leu Met Val 
370 375 380 

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG 1375 
Glu Thr He Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met 
385 390 395 

AAT GAT GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT 1417 
Asn Asp Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu 
400 405 410 

CAA CTT GAA CTT GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA 1459 
Gin Leu Glu Leu Gly Thr Phe Asn Leu His Thr Leu Phe Arg 

415 420 

GAG GTC CTC AAT CTG ATA AAG CCT ATA GCG GTT GTT AAG AAA 1501 
Glu Val Leu Asn Leu lie Lys Pro lie Ala Val Val Lys Lys 
425 430 435 

TTA CCC ATC ACA CTA AAT CTT GCA CCA GAT TTG CCA GAA TTT 1543 
Leu Pro lie Thr Leu Asn Leu Ala Pro Asp Leu Pro Glu Phe 
440 445 450 

GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG ATA ATA TTA AAT 1585 
Val Val Gly Asp Glu Lys Arg Leu Met Gin lie lie Leu Asn 
455 460 465 

ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT AGT ATC 1627 
lie Val Gly Asn Ala Val Lys Phe Ser Lys Gin Gly Ser lie 
470 475 480 

TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669 
Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala 

485 490 

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA 1711 

Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg 

495 500 505 

GTG AAG GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC 1753 

Val Lys Val Lys Asp Ser Gly Ala Gly He Asn Pro Gin Asp 
510 515 520 

ATT CCA AAG ATT TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA 1795 



lie Pro Lys He Phe Thr Lys Phe Ala Gin Thr Gin Ser Leu 
525 530 535 



GCG ACG AGA AGC TCG GGT GGT AGT GGG CTT GGC CTC GCC ATC 1837 
Ala Thr Arg Ser Ser Gly Gly Ser Gly Leu Gly Leu Ala lie 
540 545 550 

TCC AAG AGG TTT GTG AAT CTG ATG GAG GGT AAC ATT TGG ATT 1879 
Ser Lys Arg Phe Val Asn Leu Met Glu Gly Asn He Trp He 

555 560 

GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG GCT ATC TTT GAT 1921 
Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala He Phe Asp 
565 570 575 

GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT AAA CAG 1963 
Val Lys Leu Gly He Ser Glu Arg Ser Asn Glu Ser Lys Gin 
580 585 590 

TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005 
Ser Gly He Pro Lys Val Pro Ala He Pro Arg His Ser Asn 
595 600 605 

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA 2047 
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val 
610 615 620 

AGT AGA ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC 2089 
Ser Arg Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys 
625 630 

GAA GTG ACC ACG GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT 2131 
Glu Val Thr Thr Val Ser Ser Asn Glu Glu Cys Leu Arg Val 
635 640 645 

GTG TCC CAT GAG CAC AAA GTG GTC TTC ATG GAC GTG TGC ATG 2173 
Val Ser His Glu His Lys Val Val Phe Met Asp Val Cys Met 
650 655 660 

CCC GGG GTC GAA AAC TAC CAA ATC GCT CTC CGT ATT CAC GAG 2215 
Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu Arg Ile His Glu 
665 670 675 

AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA CTA CTT GTG GCA 2257 
Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu Val Ala 
680 685 690 

CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA TGC ATG 2299 
Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met 
695 700 

AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341 
Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu 
705 710 715 

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG 2383 
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg 
720 725 730 

GTA CTG TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA 2421 
Val Leu Tyr Glu Gly Met 
735 

TGCCCCAGAG GAGTAATTCC GCTCCCGCCT TCTTCTCCCG TAAAACATCG 2471 

GAAGCTGATG TTCTCTGGTT TAATTGTGTA CATATCAGAG ATTGTCGGAG 2521 

CGTTTTGGAT GATATCTTAA AACAGAAAGG GAATAACAAA ATAGAAACTC 2571 

TAAACCGGTA TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT 2621 

GGTGGTATAA TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC 2671 

CTTATATATG TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG 2721 

CTTGTTGCGT GCATGTATGA CATTGATGCA GTATTATGGC GTCAGCTTTG 2771 

CGCCGCTTAG TAGAAC 2787 
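
The residue numbers referred to in the claims (Ala-31, Ile-62, Cys-65 and Ala-102) can be located in the nucleotide numbering of the sequence above by simple arithmetic. The following Python sketch is illustrative only and is not part of the original specification. It assumes that the ATG start codon occupies positions 188 to 190, a value inferred here from the first codon line ending at 199 after four codons; the names CDS_START and codon_span are hypothetical.

```python
# Assumed position of the A of the ATG start codon in the printed numbering
# (inferred: the first codon line holds four codons and ends at 199).
CDS_START = 188


def codon_span(residue_number, cds_start=CDS_START):
    """Return the (first, last) nucleotide positions of a residue's codon."""
    first = cds_start + 3 * (residue_number - 1)
    return first, first + 2


for label, number in [("Ala", 31), ("Ile", 62), ("Cys", 65), ("Ala", 102)]:
    first, last = codon_span(number)
    print(f"{label}-{number}: nucleotides {first}-{last}")
```

Under this assumption the four codons fall at 278-280 (GCG), 371-373 (ATC), 380-382 (TGT) and 491-493 (GCG), which agrees with the codons printed at those positions in the sequence above.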

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA 50 
AAGCTTCAAC GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC 100 
CAAATCCCCA ATTCCTCCTC TTCTCCGATC AATTCTTCCC AAGTGTGTGT 150 
ATGTGTGAGA GAGGAACTAT AGTGTAAAAA ATTCATA ATG GAA GTC TGC 199 

1 

AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT GAA TTG TTA ATG 241 
Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu Leu Met 
5 10 15 

AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT GTG TAT 283 
Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Val Tyr 
20 25 30 

TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325 
Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser 
35 40 45 

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT 367 
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala 
50 55 60 

TTT ATC GTT CTT TGT GGA GCA ACT CAT CTT ATT AAC TTA TGG 409 
Phe Ile Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp 
65 70 

ACT TTC ACT ACG CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT 451 
Thr Phe Thr Thr His Ser Arg Thr Val Ala Leu Val Met Thr 
75 80 85 

ACC GCG AAG GTG TTA ACC GCT GTT GTC TCG TGT GCT ACT GCG 493 
Thr Ala Lys Val Leu Thr Ala Val Val Ser Cys Ala Thr Ala 
90 95 100 

TTG ATG CTT GTT CAT ATT ATT CCT GAT CTT TTG AGT GTT AAG 535 
Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val Lys 
105 110 115 

ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT GCT GAG CTC GAT 577 
Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu Leu Asp 
120 125 130 

AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC GGA AGG 619 
Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg 
135 140 

CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661 
His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp 
145 150 155 

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG 703 
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg 
160 165 170 

ACA TTA GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA 745 
Thr Leu Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg 
175 180 185 

ACT GGG TTA GAG CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA 787 
Thr Gly Leu Glu Leu Gln Leu Ser Tyr Thr Leu Arg His Gln 
190 195 200 

CAT CCC GTG GAG TAT ACG GTT CCT ATT CAA TTA CCG GTG ATT 829 
His Pro Val Glu Tyr Thr Vai Pro He Gin Leu Pro Val lie 

205 210 

AAC CAA GTG TTT GGT ACT AGT AGG GCT GTA AAA ATA TCT CCT 871 
Asn Gin Val Phe Gly Thr Ser Arg Ala Val Lys lie Ser Pro 
215 220 225 

AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT TCT GGG AAA TAT 913 
Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly Lys Tyr 
230 235 240 

ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT CTC CAC 955 
Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His 
245 250 255 

CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997 
Leu Ser Asn Phe Gin He Asn Asp Trp Pro Glu Leu Ser Thr 
260 265 270 

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT 1039 
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser 

275 280 

GCA AGG CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC 1081 
Ala Arg Gin Trp His Val His Glu Leu Glu Leu Val Glu Val 
285 290 295 

GTC GCT GAT CAG GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC 1123 

Val Ala Asp Gln Val Ala Val Ala Leu Ser His Ala Ala Ile 
300 305 310 

CTA GAA GAG TCG ATG CGA GCT AGG GAC CTT CTC ATG GAG CAG 1165 
Leu Glu Glu Ser Met Arg Ala Arg Asp Leu Leu Met Glu Gin 
315 320 325 

AAT GTT GCT CTT GAT CTA GCT AGA CGA GAA GCA GAA ACA GCA 1207 
Asn Val Ala Leu Asp Leu Ala Arg Arg Glu Ala Glu Thr Ala 
330 335 340 

ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT ATG AAC CAT GAA 1249 
He Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu 

345 350 

ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT TCC TTA 1291 
Met Arg Thr Pro Met His Ala He He Ala Leu Ser Ser Leu 
355 360 365 

CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333 
Leu Gin Glu Thr Glu Leu Thr Pro Glu Gin Arg Leu Met Val 
370 375 380 

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG 1375 
Glu Thr lie Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met 
385 390 395 

AAT GAT GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT 1417 
Asn Asp Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu 
400 405 410 

CAA CTT GAA CTT GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA 1459 
Gin Leu Glu Leu Gly Thr Phe Asn Leu His Thr Leu Phe Arg 

415 420 

GAG GTC CTC AAT CTG ATA AAG CCT ATA GCG GTT GTT AAG AAA 1501 

Glu Val Leu Asn Leu He Lys Pro He Ala Val Val Lys Lys 
425 430 435 

TTA CCC ATC ACA CTA AAT CTT GCA CCA GAT TTG CCA GAA TTT 1543 
Leu Pro Ile Thr Leu Asn Leu Ala Pro Asp Leu Pro Glu Phe 
440 445 450 

GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG ATA ATA TTA AAT 1585 
val Val Gl^ Asp Glu Lys Arg Leu Met Gin He He Leu Asn 

ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT AGT ATC 1627 
He Val Gly Asn Ala Val Lys Phe Ser Lys Gin Gly Ser He 

TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669 
Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Aia 

485 4yu 

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA 1711 
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg 
495 500 505 

GTG AAG GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC 1753 
Val Lys Val Lys Asp Ser Gl^ Ala Gly He Asn Pro Gin Asp 

ATT CCA AAG ATT TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA 1795 
He Pro Lys He Phe Thr Lys Phe Ala Gin Thr Gin Ser Leu 
525 530 -'■5-' 

GCG ACG AGA AGC TCG GGT GGT AGT GGG CTT GGC CTC GCC ATC 1837 
Ala Thr Arg |er Ser Gly Gly Ser Gl^ Leu Gly Leu Ala lie 

TCC AAG AGG TTT GTG AAT CTG ATG GAG GGT AAC ATT TGG ATT 1879 
Ser Lys Arg Phe Val Asn Leu Met Glu Gl^ Asn lie irp lie 

GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG GCT ATC TTT GAT 1921 
Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala He Phe Asp 
565 570 

GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT AAA CAG 1963 
Val Lys Leu Gly He Ser Glu Arg Ser Asn Glu Ser Lys Gin 
5B0 585 

TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005 
Ser Gly He Pro Lys Val Pro Ala He Pro Arg His Ser Asn 
595 600 ^^-^ 

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA 2047 
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val 
610 615 620 

AGT AGA ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC 2089 
Ser Arg Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys 
625 630 

GAA GTG ACC ACG GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT 2131 
Glu Val Thr Thr Val Ser Ser Asn Glu Glu Cys Leu Arg Val 
635 640 645 

GTG TCC CAT GAG CAC AAA GTG GTC TTC ATG GAC GTG TGC ATG 2173 
Val Ser His Glu His Lys Val Val Phe Met Asp Val Cys Met 
650 655 660 

CCC GGG GTC GAA AAC TAC CAA ATC GCT CTC CGT ATT CAC GAG 2215 
Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu Arg Ile His Glu 
665 670 675 

AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA CTA CTT GTG GCA 2257 
Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu Val Ala 
680 685 690 

CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA TGC ATG 2299 
Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met 
695 700 

AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341 
Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu 
705 710 715 

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG 2383 
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg 
720 725 730 

GTA CTG TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA 2421 
Val Leu Tyr Glu Gly Met 
735 

TGCCCCAGAG GAGTAATTCC GCTCCCGCCT TCTTCTCCCG TAAAACATCG 2471 

GAAGCTGATG TTCTCTGGTT TAATTGTGTA CATATCAGAG ATTGTCGGAG 2521 

CGTTTTGGAT GATATCTTAA AACAGAAAGG GAATAACAAA ATAGAAACTC 2571 

TAAACCGGTA TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT 2621 

GGTGGTATAA TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC 2671 

CTTATATATG TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG 2721 

CTTGTTGCGT GCATGTATGA CATTGATGCA GTATTATGGC GTCAGCTTTG 2771 

CGCCGCTTAG TAGAAC 2787

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA 50
AAGCTTCAAC GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC 100
CAAATCCCCA ATTCCTCCTC TTCTCCGATC AATTCTTCCC AAGTGTGTGT 150
ATGTGTGAGA GAGGAACTAT AGTGTAAAAA ATTCATA ATG GAA GTC TGC 199
Met Glu Val Cys
1

AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT GAA TTG TTA ATG 241 
Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu Leu Met
5 10 15

AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT GCG TAT 283
Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Ala Tyr
20 25

TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325
Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser
35 40

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT 367
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala
50 55

TTT TTC GTT CTT TGT GGA GCA ACT CAT CTT ATT AAC TTA TGG 409
Phe Phe Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp

ACT TTC ACT ACG CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT 451
Thr Phe Thr Thr His Ser Arg Thr Val Ala Leu Val Met Thr
75 80 85

ACC GCG AAG GTG TTA ACC GCT GTT GTC TCG TGT GCT ACT GCG 493
Thr Ala Lys Val Leu Thr Ala Val Val Ser Cys Ala Thr Ala
90 95

TTG ATG CTT GTT CAT ATT ATT CCT GAT CTT TTG AGT GTT AAG 535
Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val Lys
105

ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT GCT GAG CTC GAT 577
Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu Leu Asp

AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC GGA AGG 619
Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg
135

CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661
His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp
145 150 155

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG 703
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg
160

ACA TTA GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA 745
Thr Leu Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg
175 180

ACT GGG TTA GAG CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA 787
Thr Gly Leu Glu Leu Gln Leu Ser Tyr Thr Leu Arg His Gln
190 195 200

CAT CCC GTG GAG TAT ACG GTT CCT ATT CAA TTA CCG GTG ATT 829
His Pro Val Glu Tyr Thr Val Pro Ile Gln Leu Pro Val Ile
205 210

AAC CAA GTG TTT GGT ACT AGT AGG GCT GTA AAA ATA TCT CCT 871
Asn Gln Val Phe Gly Thr Ser Arg Ala Val Lys Ile Ser Pro
215 220 225

AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT TCT GGG AAA TAT 913
Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly Lys Tyr
230 235 240

ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT CTC CAC 955
Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His
250 255

CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997
Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr
260 265 270

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT 1039
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser
275 280

GCA AGG CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC 1081
Ala Arg Gln Trp His Val His Glu Leu Glu Leu Val Glu Val

GTC GCT GAT CAG GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC 1123
Val Ala Asp Gln Val Ala Val Ala Leu Ser His Ala Ala Ile
305

CTA GAA GAG TCG ATG CGA GCT AGG GAC CTT CTC ATG GAG CAG 1165
Leu Glu Glu Ser Met Arg Ala Arg Asp Leu Leu Met Glu Gln
325

AAT GTT GCT CTT GAT CTA GCT AGA CGA GAA GCA GAA ACA GCA 1207
Asn Val Ala Leu Asp Leu Ala Arg Arg Glu Ala Glu Thr Ala
330 335 340

ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT ATG AAC CAT GAA 1249
Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu
345 350

ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT TCC TTA 1291
Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser Ser Leu
360 365

CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333
Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val
370 375 380

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG 1375
Glu Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met
385 390 395

AAT GAT GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT 1417
Asn Asp Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu
400 405

CAA CTT GAA CTT GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA 1459
Gln Leu Glu Leu Gly Thr Phe Asn Leu His Thr Leu Phe Arg
415 420

GAG GTC CTC AAT CTG ATA AAG CCT ATA GCG GTT GTT AAG AAA 1501
Glu Val Leu Asn Leu Ile Lys Pro Ile Ala Val Val Lys Lys
425 430 435

TTA CCC ATC ACA CTA AAT CTT GCA CCA GAT TTG CCA GAA TTT 1543
Leu Pro Ile Thr Leu Asn Leu Ala Pro Asp Leu Pro Glu Phe
440 445 450

GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG ATA ATA TTA AAT 1585
Val Val Gly Asp Glu Lys Arg Leu Met Gln Ile Ile Leu Asn
455 460 465

ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT AGT ATC 1627
Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly Ser Ile
470 475

TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669
Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala
485 490

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA 1711
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg
495 500 505

GTG AAG GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC 1753
Val Lys Val Lys Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp
510 515 520

ATT CCA AAG ATT TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA 1795
Ile Pro Lys Ile Phe Thr Lys Phe Ala Gln Thr Gln Ser Leu
525 530

GCG ACG AGA AGC TCG GGT GGT AGT GGG CTT GGC CTC GCC ATC 1837
Ala Thr Arg Ser Ser Gly Gly Ser Gly Leu Gly Leu Ala Ile

TCC AAG AGG TTT GTG AAT CTG ATG GAG GGT AAC ATT TGG ATT 1879
Ser Lys Arg Phe Val Asn Leu Met Glu Gly Asn Ile Trp Ile
555 560

GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG GCT ATC TTT GAT 1921
Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala Ile Phe Asp

GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT AAA CAG 1963
Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser Lys Gln
580 585 590



TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005
Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn
600

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA 2047
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val
610 615 620

AGT AGA ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC 2089
Ser Arg Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys
625 630

GAA GTG ACC ACG GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT 2131
Glu Val Thr Thr Val Ser Ser Asn Glu Glu Cys Leu Arg Val
635 640 645

GTG TCC CAT GAG CAC AAA GTG GTC TTC ATG GAC GTG TGC ATG 2173
Val Ser His Glu His Lys Val Val Phe Met Asp Val Cys Met
650 655 660

CCC GGG GTC GAA AAC TAC CAA ATC GCT CTC CGT ATT CAC GAG 2215
Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu Arg Ile His Glu
665 670 675

AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA CTA CTT GTG GCA 2257
Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu Val Ala
680 685 690

CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA TGC ATG 2299
Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met
695 700

AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341
Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu
705 710 715

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG 2383
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg
720 725 730

GTA CTG TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA 2421
Val Leu Tyr Glu Gly Met

TGCCCCAGAG GAGTAATTCC GCTCCCGCCT TCTTCTCCCG TAAAACATCG 2471 

GAAGCTGATG TTCTCTGGTT TAATTGTGTA CATATCAGAG ATTGTCGGAG 2521 

CGTTTTGGAT GATATCTTAA AACAGAAAGG GAATAACAAA ATAGAAACTC 2571 

TAAACCGGTA TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT 2621 

GGTGGTATAA TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC 2671 

CTTATATATG TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG 2721 

CTTGTTGCGT GCATGTATGA CATTGATGCA GTATTATGGC GTCAGCTTTG 2771

CGCCGCTTAG TAGAAC 2787

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA 50
AAGCTTCAAC GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC 100
CAAATCCCCA ATTCCTCCTC TTCTCCGATC AATTCTTCCC AAGTGTGTGT 150
ATGTGTGAGA GAGGAACTAT AGTGTAAAAA ATTCATA ATG GAA GTC TGC 199
Met Glu Val Cys
1

AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT GAA TTG TTA ATG 241 
Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu Leu Met
5 10 15

AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT GCG TAT 283
Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Ala Tyr

TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325
Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser
35 40

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT 367
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala
50

TTT ATC GTT CTT TAT GGA GCA ACT CAT CTT ATT AAC TTA TGG 409
Phe Ile Val Leu Tyr Gly Ala Thr His Leu Ile Asn Leu Trp

ACT TTC ACT ACG CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT 451
Thr Phe Thr Thr His Ser Arg Thr Val Ala Leu Val Met Thr
75 80 85

ACC GCG AAG GTG TTA ACC GCT GTT GTC TCG TGT GCT ACT GCG 493
Thr Ala Lys Val Leu Thr Ala Val Val Ser Cys Ala Thr Ala
95 100

TTG ATG CTT GTT CAT ATT ATT CCT GAT CTT TTG AGT GTT AAG 535
Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val Lys
105 110

ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT GCT GAG CTC GAT 577
Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu Leu Asp

AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC GGA AGG 619
Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg
135 140

CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661
His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp
145 150 155

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG 703
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg
160 165 170

ACA TTA GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA 745
Thr Leu Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg
175 180

ACT GGG TTA GAG CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA 787
Thr Gly Leu Glu Leu Gln Leu Ser Tyr Thr Leu Arg His Gln
190 195 200

CAT CCC GTG GAG TAT ACG GTT CCT ATT CAA TTA CCG GTG ATT 829
His Pro Val Glu Tyr Thr Val Pro Ile Gln Leu Pro Val Ile
205 210

AAC CAA GTG TTT GGT ACT AGT AGG GCT GTA AAA ATA TCT CCT 871
Asn Gln Val Phe Gly Thr Ser Arg Ala Val Lys Ile Ser Pro
215 220 225

AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT TCT GGG AAA TAT 913
Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly Lys Tyr
230 235 240

ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT CTC CAC 955
Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His
245 250 255

CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997
Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr
260 265 270

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT 1039
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser
275 280

GCA AGG CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC 1081
Ala Arg Gln Trp His Val His Glu Leu Glu Leu Val Glu Val
285 290 295

GTC GCT GAT CAG GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC 1123
Val Ala Asp Gln Val Ala Val Ala Leu Ser His Ala Ala Ile
300 305 310

CTA GAA GAG TCG ATG CGA GCT AGG GAC CTT CTC ATG GAG CAG 1165
Leu Glu Glu Ser Met Arg Ala Arg Asp Leu Leu Met Glu Gln
315 320 325

AAT GTT GCT CTT GAT CTA GCT AGA CGA GAA GCA GAA ACA GCA 1207
Asn Val Ala Leu Asp Leu Ala Arg Arg Glu Ala Glu Thr Ala
330 335 340

ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT ATG AAC CAT GAA 1249
Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu
345 350

ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT TCC TTA 1291
Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser Ser Leu
355 360 365

CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333
Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val
370 375 380

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG 1375
Glu Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met
385 390 395

AAT GAT GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT 1417
Asn Asp Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu
400 405 410

CAA CTT GAA CTT GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA 1459
Gln Leu Glu Leu Gly Thr Phe Asn Leu His Thr Leu Phe Arg
415 420

GAG GTC CTC AAT CTG ATA AAG CCT ATA GCG GTT GTT AAG AAA 1501
Glu Val Leu Asn Leu Ile Lys Pro Ile Ala Val Val Lys Lys
425 430 435

TTA CCC ATC ACA CTA AAT CTT GCA CCA GAT TTG CCA GAA TTT 1543
Leu Pro Ile Thr Leu Asn Leu Ala Pro Asp Leu Pro Glu Phe
440 445 450

GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG ATA ATA TTA AAT 1585
Val Val Gly Asp Glu Lys Arg Leu Met Gln Ile Ile Leu Asn
455 460 465

ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT AGT ATC 1627
Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly Ser Ile
470 475 480

TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669
Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala
485 490

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA 1711
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg
495 500 505

GTG AAG GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC 1753
Val Lys Val Lys Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp
510 515 520

ATT CCA AAG ATT TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA 1795
Ile Pro Lys Ile Phe Thr Lys Phe Ala Gln Thr Gln Ser Leu
525 530 535

GCG ACG AGA AGC TCG GGT GGT AGT GGG CTT GGC CTC GCC ATC 1837
Ala Thr Arg Ser Ser Gly Gly Ser Gly Leu Gly Leu Ala Ile
540 545 550

TCC AAG AGG TTT GTG AAT CTG ATG GAG GGT AAC ATT TGG ATT 1879
Ser Lys Arg Phe Val Asn Leu Met Glu Gly Asn Ile Trp Ile
555 560

GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG GCT ATC TTT GAT 1921
Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala Ile Phe Asp
565 570 575

GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT AAA CAG 1963
Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser Lys Gln
580 585 590

TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005
Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn
595 600 605

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA 2047
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val
610 615 620

AGT AGA ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC 2089
Ser Arg Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys
625 630

GAA GTG ACC ACG GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT 2131
Glu Val Thr Thr Val Ser Ser Asn Glu Glu Cys Leu Arg Val

GTG TCC CAT GAG CAC AAA GTG GTC TTC ATG GAC GTG TGC ATG 2173
Val Ser His Glu His Lys Val Val Phe Met Asp Val Cys Met
655 660

CCC GGG GTC GAA AAC TAC CAA ATC GCT CTC CGT ATT CAC GAG 2215
Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu Arg Ile His Glu

AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA CTA CTT GTG GCA 2257
Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu Val Ala
685

CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA TGC ATG 2299
Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met

AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341
Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG 2383
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg
720 725 730

GTA CTG TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA TGCCCCAGAG 2431
Val Leu Tyr Glu Gly Met
735

GAGTAATTCC GCTCCCGCCT TCTTCTCCCG TAAAACATCG GAAGCTGATG 2481

TTCTCTGGTT TAATTGTGTA CATATCAGAG ATTGTCGGAG CGTTTTGGAT 2531 

GATATCTTAA AACAGAAAGG GAATAACAAA ATAGAAACTC TAAACCGGTA 2581 

TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT GGTGGTATAA 2631 

TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC CTTATATATG 2681 

TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG CTTGTTGCGT 2731 

GCATGTATGA CATTGATGCA GTATTATGGC GTCAGCTTTG CGCCGCTTAG 2781 

TAGAAC 2787 



FIG. 6D 

SUBSTITUTE SHEET (RULE 26)



WO 95/01439 PCT/US94/07418



21 / 65 

AGTAAGAACG AAGAAGAAGT GTTAAACCCA ACCAATTTTG ACTTGAAAAA 50 
AAGCTTCAAC GCTCCCCTTT TCTCCTTCTC CGTCGCTCTC CGCCGCGTCC 100 
CAAATCCCCA ATTCCTCCTC TTCTCCGATC AATTCTTCCC AAGTGTGTGT 150 
ATGTGTGAGA GAGGAACTAT AGTGTAAAAA ATTCATA ATG GAA GTC TGC 199
Met Glu Val Cys
1

AAT TGT ATT GAA CCG CAA TGG CCA GCG GAT GAA TTG TTA ATG 241 
Asn Cys Ile Glu Pro Gln Trp Pro Ala Asp Glu Leu Leu Met
5 10 15

AAA TAC CAA TAC ATC TCC GAT TTC TTC ATT GCG ATT GCG TAT 283
Lys Tyr Gln Tyr Ile Ser Asp Phe Phe Ile Ala Ile Ala Tyr

TTT TCG ATT CCT CTT GAG TTG ATT TAC TTT GTG AAG AAA TCA 325
Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val Lys Lys Ser
35

GCC GTG TTT CCG TAT AGA TGG GTA CTT GTT CAG TTT GGT GCT 367
Ala Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala
50

TTT ATC GTT CTT TGT GGA GCA ACT CAT CTT ATT AAC TTA TGG 409
Phe Ile Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp

ACT TTC ACT ACG CAT TCG AGA ACC GTG GCG CTT GTG ATG ACT 451
Thr Phe Thr Thr His Ser Arg Thr Val Ala Leu Val Met Thr
75 80

ACC GCG AAG GTG TTA ACC GCT GTT GTC TCG TGT GCT ACT ACG 493
Thr Ala Lys Val Leu Thr Ala Val Val Ser Cys Ala Thr Thr
90 95 100

TTG ATG CTT GTT CAT ATT ATT CCT GAT CTT TTG AGT GTT AAG 535
Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val Lys
105

ACT CGG GAG CTT TTC TTG AAA AAT AAA GCT GCT GAG CTC GAT 577
Thr Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu Leu Asp

AGA GAA ATG GGA TTG ATT CGA ACT CAG GAA GAA ACC GGA AGG 619
Arg Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg
135

CAT GTG AGA ATG TTG ACT CAT GAG ATT AGA AGC ACT TTA GAT 661
His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp
145 150

AGA CAT ACT ATT TTA AAG ACT ACA CTT GTT GAG CTT GGT AGG 703
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg
160

ACA TTA GCT TTG GAG GAG TGT GCA TTG TGG ATG CCT ACT AGA 745
Thr Leu Ala Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg

175 180

ACT GGG TTA GAG CTA CAG CTT TCT TAT ACA CTT CGT CAT CAA 787
Thr Gly Leu Glu Leu Gln Leu Ser Tyr Thr Leu Arg His Gln
190 195 200

CAT CCC GTG GAG TAT ACG GTT CCT ATT CAA TTA CCG GTG ATT 829
His Pro Val Glu Tyr Thr Val Pro Ile Gln Leu Pro Val Ile
205 210

AAC CAA GTG TTT GGT ACT AGT AGG GCT GTA AAA ATA TCT CCT 871
Asn Gln Val Phe Gly Thr Ser Arg Ala Val Lys Ile Ser Pro
215 220 225

AAT TCT CCT GTG GCT AGG TTG AGA CCT GTT TCT GGG AAA TAT 913
Asn Ser Pro Val Ala Arg Leu Arg Pro Val Ser Gly Lys Tyr
230 235 240

ATG CTA GGG GAG GTG GTC GCT GTG AGG GTT CCG CTT CTC CAC 955
Met Leu Gly Glu Val Val Ala Val Arg Val Pro Leu Leu His
245 250 255

CTT TCT AAT TTT CAG ATT AAT GAC TGG CCT GAG CTT TCA ACA 997
Leu Ser Asn Phe Gln Ile Asn Asp Trp Pro Glu Leu Ser Thr
260 265 270

AAG AGA TAT GCT TTG ATG GTT TTG ATG CTT CCT TCA GAT AGT 1039
Lys Arg Tyr Ala Leu Met Val Leu Met Leu Pro Ser Asp Ser
275 280

GCA AGG CAA TGG CAT GTC CAT GAG TTG GAA CTC GTT GAA GTC 1081
Ala Arg Gln Trp His Val His Glu Leu Glu Leu Val Glu Val
285 290 295

GTC GCT GAT CAG GTG GCT GTA GCT CTC TCA CAT GCT GCG ATC 1123
Val Ala Asp Gln Val Ala Val Ala Leu Ser His Ala Ala Ile
300 305 310

CTA GAA GAG TCG ATG CGA GCT AGG GAC CTT CTC ATG GAG CAG 1165
Leu Glu Glu Ser Met Arg Ala Arg Asp Leu Leu Met Glu Gln
315 320 325

AAT GTT GCT CTT GAT CTA GCT AGA CGA GAA GCA GAA ACA GCA 1207
Asn Val Ala Leu Asp Leu Ala Arg Arg Glu Ala Glu Thr Ala
330 335 340

ATC CGT GCC CGC AAT GAT TTC CTA GCG GTT ATG AAC CAT GAA 1249
Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu
345 350

ATG CGA ACA CCG ATG CAT GCG ATT ATT GCA CTC TCT TCC TTA 1291
Met Arg Thr Pro Met His Ala Ile Ile Ala Leu Ser Ser Leu
355 360 365

CTC CAA GAA ACG GAA CTA ACC CCT GAA CAA AGA CTG ATG GTG 1333
Leu Gln Glu Thr Glu Leu Thr Pro Glu Gln Arg Leu Met Val
370 375 380

GAA ACA ATA CTT AAA AGT AGT AAC CTT TTG GCA ACT TTG ATG 1375
Glu Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Met
385 390 395

AAT GAT GTC TTA GAT CTT TCA AGG TTA GAA GAT GGA AGT CTT 1417
Asn Asp Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu
400 410

CAA CTT GAA CTT GGG ACA TTC AAT CTT CAT ACA TTA TTT AGA 1459
Gln Leu Glu Leu Gly Thr Phe Asn Leu His Thr Leu Phe Arg
415 420

GAG GTC CTC AAT CTG ATA AAG CCT ATA GCG GTT GTT AAG AAA 1501
Glu Val Leu Asn Leu Ile Lys Pro Ile Ala Val Val Lys Lys
425 430 435

TTA CCC ATC ACA CTA AAT CTT GCA CCA GAT TTG CCA GAA TTT 1543
Leu Pro Ile Thr Leu Asn Leu Ala Pro Asp Leu Pro Glu Phe
440 445 450

GTT GTT GGG GAT GAG AAA CGG CTA ATG CAG ATA ATA TTA AAT 1585
Val Val Gly Asp Glu Lys Arg Leu Met Gln Ile Ile Leu Asn
455 460 465

ATA GTT GGT AAT GCT GTG AAA TTC TCC AAA CAA GGT AGT ATC 1627
Ile Val Gly Asn Ala Val Lys Phe Ser Lys Gln Gly Ser Ile
470 475 480

TCC GTA ACC GCT CTT GTC ACC AAG TCA GAC ACA CGA GCT GCT 1669
Ser Val Thr Ala Leu Val Thr Lys Ser Asp Thr Arg Ala Ala
485 490

GAC TTT TTT GTC GTG CCA ACT GGG AGT CAT TTC TAC TTG AGA 1711
Asp Phe Phe Val Val Pro Thr Gly Ser His Phe Tyr Leu Arg
495 500 505

GTG AAG GTA AAA GAC TCT GGA GCA GGA ATA AAT CCT CAA GAC 1753
Val Lys Val Lys Asp Ser Gly Ala Gly Ile Asn Pro Gln Asp
510 515 520

ATT CCA AAG ATT TTC ACT AAA TTT GCT CAA ACA CAA TCT TTA 1795
Ile Pro Lys Ile Phe Thr Lys Phe Ala Gln Thr Gln Ser Leu
525 530 535

GCG ACG AGA AGC TCG GGT GGT AGT GGG CTT GGC CTC GCC ATC 1837
Ala Thr Arg Ser Ser Gly Gly Ser Gly Leu Gly Leu Ala Ile
540 545 550

TCC AAG AGG TTT GTG AAT CTG ATG GAG GGT AAC ATT TGG ATT 1879
Ser Lys Arg Phe Val Asn Leu Met Glu Gly Asn Ile Trp Ile
555 560

GAG AGC GAT GGT CTT GGA AAA GGA TGC ACG GCT ATC TTT GAT 1921
Glu Ser Asp Gly Leu Gly Lys Gly Cys Thr Ala Ile Phe Asp
565 570 575

GTT AAA CTT GGG ATC TCA GAA CGT TCA AAC GAA TCT AAA CAG 1963
Val Lys Leu Gly Ile Ser Glu Arg Ser Asn Glu Ser Lys Gln
580 585

TCG GGC ATA CCG AAA GTT CCA GCC ATT CCC CGA CAT TCA AAT 2005
Ser Gly Ile Pro Lys Val Pro Ala Ile Pro Arg His Ser Asn

595 600 605

TTC ACT GGA CTT AAG GTT CTT GTC ATG GAT GAG AAC GGG GTA 2047
Phe Thr Gly Leu Lys Val Leu Val Met Asp Glu Asn Gly Val
610 615 620

AGT AGA ATG GTG ACG AAG GGA CTT CTT GTA CAC CTT GGG TGC 2089
Ser Arg Met Val Thr Lys Gly Leu Leu Val His Leu Gly Cys
625 630

GAA GTG ACC ACG GTG AGT TCA AAC GAG GAG TGT CTC CGA GTT 2131
Glu Val Thr Thr Val Ser Ser Asn Glu Glu Cys Leu Arg Val
635 640 645

GTG TCC CAT GAG CAC AAA GTG GTC TTC ATG GAC GTG TGC ATG 2173
Val Ser His Glu His Lys Val Val Phe Met Asp Val Cys Met
650 655 660

CCC GGG GTC GAA AAC TAC CAA ATC GCT CTC CGT ATT CAC GAG 2215
Pro Gly Val Glu Asn Tyr Gln Ile Ala Leu Arg Ile His Glu
665 670 675

AAA TTC ACA AAA CAA CGC CAC CAA CGG CCA CTA CTT GTG GCA 2257
Lys Phe Thr Lys Gln Arg His Gln Arg Pro Leu Leu Val Ala
680 685 690

CTC AGT GGT AAC ACT GAC AAA TCC ACA AAA GAG AAA TGC ATG 2299
Leu Ser Gly Asn Thr Asp Lys Ser Thr Lys Glu Lys Cys Met
695 700

AGC TTT GGT CTA GAC GGT GTG TTG CTC AAA CCC GTA TCA CTA 2341
Ser Phe Gly Leu Asp Gly Val Leu Leu Lys Pro Val Ser Leu
705 710 715

GAC AAC ATA AGA GAT GTT CTG TCT GAT CTT CTC GAG CCC CGG 2383
Asp Asn Ile Arg Asp Val Leu Ser Asp Leu Leu Glu Pro Arg
720 725 730

GTA CTG TAC GAG GGC ATG TAAAGGCGAT GGATGCCCCA TGCCCCAGAG 2431
Val Leu Tyr Glu Gly Met

GAGTAATTCC GCTCCCGCCT TCTTCTCCCG TAAAACATCG GAAGCTGATG 2481 

TTCTCTGGTT TAATTGTGTA CATATCAGAG ATTGTCGGAG CGTTTTGGAT 2531 

GATATCTTAA AACAGAAAGG GAATAACAAA ATAGAAACTC TAAACCGGTA 2581 

TGTGTCCGTG GCGATTTCGG TTATAGAGGA ACAAGATGGT GGTGGTATAA 2631 

TCATACCATT TCAGATTACA TGTTTGACTA ATGTTGTATC CTTATATATG 2681 

TAGTTACATT CTTATAAGAA TTTGGATCGA GTTATGGATG CTTGTTGCGT 2731 

GCATGTATGA CATTGATGCA GTATTATGGC GTCAGCTTTG CGCCGCTTAG 2781 

TAGAAC 2787

ACTTTTAAAA TTTCTTTATT TCATTGTCAG AAAAAGAGAG CTAATAATAT 50

TATTATTTAA ATGTAACAAG TAGGCCTATA ACACGTGAAC TTCCCTCTTT 100

GCAAAAAAAA AATCATCAAA AACTTTTACC TCTCATTGGT TTCTTCTTTA 150

TCACACTGTT ACGCTTGGAT TCTCATTTCT TCAAGTTCAT AACGCTCGGA 200

TCAATCAGGA AGACGAACTT GAACTTTCTT TTTTTCATCA TTACCCAAAG 250

CTATGAGGCT CACACCACCA ATACGTCCGC CGTCATGAAT CCTTCTCTTC 300

CAGGTACTGT GCCGTCTCGG GATAACAAAC TTTCTATTTA TTCTCTTCTG 350

ATCGGATCTA TCTATCGATG AAGATTGATT TCACTACTTT AGTAACATTT 400

CATCTGATCG ATCTGTGTTG TGTTATCGAG GAATCAATCT CATTTTGTAG 450

ATTCAATTTT CTGGATAGAT TTTGTATCTC TTTTCCATAG CTCTAGTCCA 500

AATCTAGTCT CCACTGATAT CTGAGTTTTG TTGACCAGGT CAACACAAGT 550

CAGAGCTCCA AAA ATG GAG TCA TGC GAT TGT TTT GAG ACG CAT 593
Met Glu Ser Cys Asp Cys Phe Glu Thr His
1 5 10



CTG TTA GTG AAG TAC CAA TAC ATC TCA 635
Val Asn Gln Asp Asp Leu Leu Val Lys Tyr Gln Tyr Ile Ser

GAT GCG TTG ATT GCT CTT GCA TAC TTC TCA ATC CCA CTC GAG 677
Asp Ala Leu Ile Ala Leu Ala Tyr Phe Ser Ile Pro Leu Glu
30 35

CTT ATC TAT TTC GTG CAA AAG TCT GCT TTC TTC CCT TAC AAA 719
Leu Ile Tyr Phe Val Gln Lys Ser Ala Phe Phe Pro Tyr Lys
40 45 50

TGG GTG CTT ATG CAG TTT GGA GCC TTT ATC ATT CTC TGT GGA 761
Trp Val Leu Met Gln Phe Gly Ala Phe Ile Ile Leu Cys Gly
55 60 65

GCT ACG CAT TTC ATC AAC CTA TGG ATG TTC TTC ATG CAT TCC 803
Ala Thr His Phe Ile Asn Leu Trp Met Phe Phe Met His Ser
70 75 80

AAA GCC GTT GCC ATT GTC ATG ACT ATT GCT AAA GTC TCT TGC 845
Lys Ala Val Ala Ile Val Met Thr Ile Ala Lys Val Ser Cys
85 90

GCG GTT GTG TCG TGT GCT ACC GCG TTG ATG TTG GTT CAT ATT 887
Ala Val Val Ser Cys Ala Thr Ala Leu Met Leu Val His Ile
95 100 105

ATT CCT GAT CTT CTC AGT GTT AAG AAC AGG GAA TTG TTT CTC 929
Ile Pro Asp Leu Leu Ser Val Lys Asn Arg Glu Leu Phe Leu
110 115 120

AAG AAG AAA GCT GAT GAG TTA GAT AGA GAA ATG GGT CTT ATT 971
Lys Lys Lys Ala Asp Glu Leu Asp Arg Glu Met Gly Leu Ile
125 130 135

TTA ACA CAA GAG GAG ACT GGT AGG CAT GTT AGG ATG CTT ACT 1013
Leu Thr Gln Glu Glu Thr Gly Arg His Val Arg Met Leu Thr
140 145 150

CAT GGA ATT AGA AGA ACT CTT GAT AGG CAT ACT ATT TTA AGA 1055
His Gly Ile Arg Arg Thr Leu Asp Arg His Thr Ile Leu Arg
155 160

ACC ACT CTT GTT GAG CTT GGT AAA ACT CTT TGT CTT GAG GAA 1097
Thr Thr Leu Val Glu Leu Gly Lys Thr Leu Cys Leu Glu Glu
165 170 175

TGT GCG TTG TGG ATG CCT TCT CAA AGT GGT TTA TAT TTG CAG 1139
Cys Ala Leu Trp Met Pro Ser Gln Ser Gly Leu Tyr Leu Gln

CTT TCT CAT ACT TTG AGT CAT AAA ATA CAA GTT GGA AGC AGT 1181
Leu Ser His Thr Leu Ser His Lys Ile Gln Val Gly Ser Ser
195 200 205

GTG CCG ATA AAT CTC CCG ATT ATT AAT GAA CTC TTC AAT AGC 1223
Val Pro Ile Asn Leu Pro Ile Ile Asn Glu Leu Phe Asn Ser
210 215 220

GCT CAA GCT ATG CAC ATA CCT CAT TCT TGT CCT TTG GCT AAG 1265
Ala Gln Ala Met His Ile Pro His Ser Cys Pro Leu Ala Lys
225 230

ATT GGG CCT CCG GTT GGG AGA TAT TCA CCT CCT GAG GTT GTT 1307
Ile Gly Pro Pro Val Gly Arg Tyr Ser Pro Pro Glu Val Val
235 240 245

TCT GTC CGT GTT CCT CTT TTA CAT CTC TCT AAT TTC CAA GGC 1349
Ser Val Arg Val Pro Leu Leu His Leu Ser Asn Phe Gln Gly
250 255 260

AGT GAC TGG TCG GAT CTC TCT GGC AAA GGT TAC GCT ATC ATG 1391
Ser Asp Trp Ser Asp Leu Ser Gly Lys Gly Tyr Ala Ile Met

GTC CTG ATT CTC CCA ACC GAT GGT GCA AGA AAA TGG AGA GAC 1433
Val Leu Ile Leu Pro Thr Asp Gly Ala Arg Lys Trp Arg Asp
280 285

CAT GAG TTA GAG CTT GTA GAA AAC GTG GCG GAT CAG 1469
His Glu Leu Glu Leu Val Glu Asn Val Ala Asp Gln
295 300

GTCCATCTCT TTACTTGTAT ATGTTTGGTT GTGTGTCAAG TTGCTTTACC 1519

AGCTTTTAGT GTTTTGTTTT GTCCCCTGAC TCTCACTTCA TTCAG 1564

GTG GCT GTG GCT CTC TCA CAT GCT GCA ATT TTG GAA GAA TCC 1606
Val Ala Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser
305 310 315

ATG CAC GCT CGT GAC CAG CTT ATG GAG CAG AAT TTT GCT TTA 1648
Met His Ala Arg Asp Gln Leu Met Glu Gln Asn Phe Ala Leu

GAC AAG GCT CGT CAA GAG GCT GAG ATG GCA GTA CAT GCT CGA 1690
Asp Lys Ala Arg Gln Glu Ala Glu Met Ala Val His Ala Arg
340

AAT GAT TTC CTA GCT GTT ATG AAC CAC GAG ATG AGG ACA CCG 1732
Asn Asp Phe Leu Ala Val Met Asn His Glu Met Arg Thr Pro

350 355 

S^'^ T^*^ ^"^"^ TCT TCT CTT CTC CTT GAG ACT 1774 

Met His Ala Ile Ile Ser Leu Ser Ser Leu Leu Leu Glu Thr
360 365 370

GAG CTG TCT CCA GAG CAA AGA GTT ATG ATC GAG ACA ATA CTG 1816
Glu Leu Ser Pro Glu Gln Arg Val Met Ile Glu Thr Ile Leu
375 380 385

AAA AGC AGC AAT CTT GTG GCT ACA CTA ATC AGC GAC GTT CTG 1858
Lys Ser Ser Asn Leu Val Ala Thr Leu Ile Ser Asp Val Leu
390 395 400

GAT CTT TCG AGA TTG GAA GAT GGG AGC TTA CTC TTG GAA AAT 1900
Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Leu Leu Glu Asn
405 410

GAA CCA TTC AGT CTA CAA GCG ATC TTT GAA GAG GTAACTAAAT 1943
Glu Pro Phe Ser Leu Gln Ala Ile Phe Glu Glu
415 420 425

CCCCCTGATT AACCAGTGAA GTCCATTATA TATGTCTTAC ATGAATAACA 1993

TGGGCGCTTT GAATCTGCAG GTC ATC TCT TTG ATA AAG CCA ATC 2037
Val Ile Ser Leu Ile Lys Pro Ile
430

GCA TCA GTG AAG AAA CTA TCA ACG AAT CTG ATT CTG TCT GCA 2079
Ala Ser Val Lys Lys Leu Ser Thr Asn Leu Ile Leu Ser Ala
435 440 445

GAC TTA CCA ACT TAT GCT ATT GGT GAT GAG AAA CGT CTG ATG 2121
Asp Leu Pro Thr Tyr Ala Ile Gly Asp Glu Lys Arg Leu Met
450 455 460

^ ^T'^ ATC ATG GGC AAC GCT GTG AAA TTT ACT 2163 

Gln Thr Ile Leu Asn Ile Met Gly Asn Ala Val Lys Phe Thr
465 470 475

AAG GAA GGC TAC ATC TCC ATA ATA GCC TCT ATC ATG AAA CCC 2205
Lys Glu Gly Tyr Ile Ser Ile Ile Ala Ser Ile Met Lys Pro
480 485

GAG TCC TTA CAA GAA TTA CCA TCT CCA GAA TTT TTT CCA GTT 2247
Glu Ser Leu Gln Glu Leu Pro Ser Pro Glu Phe Phe Pro Val
490 495 500

CTC AGT GAC AGT CAC TTC TAC CTA TGT GTG CAG GTTAGACCCA 2290
Leu Ser Asp Ser His Phe Tyr Leu Cys Val Gln
505 510

ATCTACAAAT TACTAAACTA CAAAGTTAAG CTTCTTACTG TGTTCTTACT 2340

GTTATAATCA TGGTGCAG GTG AAG GAC ACA GGG TGT GGA ATT CAC 2385
Val Lys Asp Thr Gly Cys Gly Ile His
515 520

ACA CAA GAC ATT CCT TTG CTC TTT ACC AAA TTT GTA CAG CCT 2427
Thr Gln Asp Ile Pro Leu Leu Phe Thr Lys Phe Val Gln Pro
525 530 535
525 530 535 

FIG. 12C 




CGG ACC GGA ACT CAG AGG AAC CAT TCC GGT GGA GGA CTC GGG 2469
Arg Thr Gly Thr Gln Arg Asn His Ser Gly Gly Gly Leu Gly
540 545 550

CTA GCT CTC TGT AAA CGG TAACAACCC AAAAGTATAT ATAAGTTATA 2516
Leu Ala Leu Cys Lys Arg
555

AGCAGATGGT GTTACAAATA GCTAAAAGGC AAGTTTCTGT TGATGGATGT 2566

CTCTGGTTAG G TTT GTC GGG CTA ATG GGA GGA TAC ATG TGG 2607
Phe Val Gly Leu Met Gly Gly Tyr Met Trp
560 565

ATA GAA AGT GAA GGC CTA GAG AAA GGC TGC ACA GCT TCG TTC 2649
Ile Glu Ser Glu Gly Leu Glu Lys Gly Cys Thr Ala Ser Phe
570 575 580

ATC ATC AGG CTT GGT ATC TGC AAC GGT CCA AGC AGT AGC ACT 2691
Ile Ile Arg Leu Gly Ile Cys Asn Gly Pro Ser Ser Ser Ser
585 590 595

GGT TCA ATG GCG CTA CAT CTT GCA GCT AAA TCA CAA ACC AGA 2733
Gly Ser Met Ala Leu His Leu Ala Ala Lys Ser Gln Thr Arg
600 605

CCG TGG AAC TGG TGATACTTAC GTTGGAAAGA CTTGTATTGA 2775 

Pro Trp Asn Trp 

610 

GGTGAGACTT TTTAACTACA CAGCAGCAAG AGAAAGAAGA AAATACATGA 2825 

CCGGACGGTG TGATCTAACT TATTGGATTT TGTTGGATGT AATATGTAAA 2875 

ATAAAAATCC TATATACGGG GAGAGGTACC TTATCTGTTC TCACTATATT 2925 

TTATTGAACA TTACTTTAGA GAATATGTTT TGGAATTCAC TACTAAATAA 2975 

ACGATATAAA TCTTCACGAA AAGAGCAACA TTTT 3009 






FIG. 12D 


AAAAAAATCA TCAAAAACTT TTACCTCTCA TTGGTTTCTT CTTTATCACA 50 

CTGTTACGCT TGGATTCTCA TTTCTTCAAG TTCATAACGC TCGGATCAAT 100 

CAGGAAGACG AACTTGAACT TTCTTTTTTT CATCATTACC CAAAGCTATG 150 

AGGCTCACAC CACCAATACG TCCGCCGTCA TGAATCCTTC TCTTCCAGGT 200 

CAACACAAGT CAGAGCTCCA AAA ATG GAG TCA TGC GAT TGT TTT 244 

Met Glu Ser Cys Asp Cys Phe

S^^ 9'^9 GAT CTG TTA GTG AAG TAG CAA 286 

Glu Thr His Val Asn Gin Asp Asp Leu Leu Val Lys Tyr Gin 
10 15 20 

TAG ATC TCA GAT GCG TTG ATT GCT GTT GCA TAG TTC TCA ATG 328 
Tyr He Ser Asp Ala Leu He Ala Leu Ala Tyr Phe Ser lie 
25 30 .35 

CCA CTG GAG CTT ATC TAT TTG GTG CAA AAG TGT GCT TTC TTG 370 
Pro Leu Glu Leu He Tyr Phe Val Gin Lys Ser Ala Phe Phe 

40 45 

CCT TAG AAA TGG GTG CTT ATG GAG TTT GGA GCG TTT ATG ATT 417 
Pro Tyr Lys Trp Val Leu Met Gin Phe Gly Ala Phe He He 

55 60 

CTG TGT GGA GCT ACG CAT TTC ATC AAG CTA TGG ATG TTG TTC 454 
Leu Cys Gly Ala Thr His Phe He Asn Leu Trp Met Phe Phe ' 
65 70 75 

ATG CAT TGG AAA GCG GTT GCG ATT GTG ATG ACT ATT GCT AAA 496 
Met His Ser Lys Ala Val Ala He Val Met Thr He Ala Lys 
80 85 90 

GTG TGT TGG GCG GTT GTG TGG TGT GCT ACG GCG TTG ATG TTG 538 
Val Ser Cys Ala Val Val Ser Cys Ala Thr Ala Leu Met Leu 
95 100 105 

GTT CAT ATT ATT GCT GAT GTT CTG AGT GTT AAG AAG AGG GAA 580 
Val His He He Pro Asp Leu Leu Ser Val Lys Asn Ara Glu 

110 115 

TTG TTT CTG AAG AAG AAA GCT GAT GAG TTA GAT AGA GAA ATG 622 

l-ys ^ys Lys Ala Asp Glu Leu Asp Arg Glu Met 
120 125 

GGT CTT ATT TTA AGA GAA GAG GAG AGT GGT AGG GAT GTT AGG 664 
Gly Leu Ile Leu Thr Gln Glu Glu Thr Gly Arg His Val Arg
135 140 145 

ATG CTT AGT CAT GGA ATT AGA AGA AGT CTT GAT AGG CAT ACT 706 
Met Leu Thr His Gly Ile Arg Arg Thr Leu Asp Arg His Thr
150 155 160

ATT TTA AGA ACC AGT CTT GTT GAG CTT GGT AAA ACT CTT TGT 748 
Ile Leu Arg Thr Thr Leu Val Glu Leu Gly Lys Thr Leu Cys
165 170 175
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CTT GAG GAA TGT GCG TTG TGG ATG CCT TCT CAA. ACT GGT TTA 7 90 

Leu Glu Glu Cys Ala Leu Trp Met Pro Ser Gln Ser Gly Leu

180 185 

TAT TTG CAG CTT TCT CAT ACT TTG AGT CAT AAA ATA CAA GTT 832 

Tyr Leu Gln Leu Ser His Thr Leu Ser His Lys Ile Gln Val

190 195 200



GGA AGC AGT GTG CCG ATA AAT CTC CCG ATT ATT AAT GAA CTC 874 
Gly Ser Ser Val Pro Ile Asn Leu Pro Ile Ile Asn Glu Leu
205 210 215 

TTC AAT AGC GCT CAA GCT ATG CAC ATA CCT CAT TCT TGT CCT 916 
Phe Asn Ser Ala Gln Ala Met His Ile Pro His Ser Cys Pro
220 225 230 

TTG GCT AAG ATT GGG CCT CCG GTT GGG AGA TAT TCA CCT CCT 958 
Leu Ala Lys Ile Gly Pro Pro Val Gly Arg Tyr Ser Pro Pro
235 240 245 

GAG GTT GTT TCT GTC CGT GTT CCT CTT TTA CAT CTC TCT AAT 1000 
Glu Val Val Ser Val Arg Val Pro Leu Leu His Leu Ser Asn 

250 255 

TTC CAA GGC AGT GAC TGG TCG GAT CTC TCT GGC AAA GGT TAC 1042 
Phe Gln Gly Ser Asp Trp Ser Asp Leu Ser Gly Lys Gly Tyr
260 265 270 

GCT ATC ATG GTC CTG ATT CTC CCA ACC GAT GGT GCA AGA AAA 1084 
Ala Ile Met Val Leu Ile Leu Pro Thr Asp Gly Ala Arg Lys
275 280 285 

TGG AGA GAC CAT GAG TTA GAG CTT GTA GAA AAC GTG GCG GAT 1126 
Trp Arg Asp His Glu Leu Glu Leu Val Glu Asn Val Ala Asp 
290 295 300 

CAG GTG GCT GTG GCT CTC TCA CAT GCT GCA ATT TTG GAA GAA 1168 
Gln Val Ala Val Ala Leu Ser His Ala Ala Ile Leu Glu Glu
305 310 315 

TCC ATG CAC GCT CGT GAC CAG CTT ATG GAG CAG AAT TTT GCT 1210 
Ser Met His Ala Arg Asp Gln Leu Met Glu Gln Asn Phe Ala

320 325 

TTA GAC AAG GCT CGT CAA GAG GCT GAG ATG GCA GTA CAT GCT 1252 
Leu Asp Lys Ala Arg Gln Glu Ala Glu Met Ala Val His Ala
330 335 340 

CGA AAT GAT TTC CTA GCT GTT ATG AAC CAC GAG ATG AGG ACA 1294 
Arg Asn Asp Phe Leu Ala Val Met Asn His Glu Met Arg Thr 
345 350 355 

CCG ATG CAT GCC ATC ATC TCT CTT TCT TCT CTT CTC CTT GAG 1336 
Pro Met His Ala Ile Ile Ser Leu Ser Ser Leu Leu Leu Glu
360 365 370 

ACT GAG CTG TCT CCA GAG CAA AGA GTT ATG ATC GAG ACA ATA 1378 
Thr Glu Leu Ser Pro Glu Gln Arg Val Met Ile Glu Thr Ile
375 380 385
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^ c^*^ ^"^T ^"^G ACA CTA ATC AGC GAC GTT 1420 

Leu Lys Ser Ser Asn Leu Val Ala Thr Leu Ile Ser Asp Val

390 395 

CTG GAT CTT TCG AGA TTG GAA GAT GGG AGC TTA CTC TTG GAA 1462
Leu Asp Leu Ser Arg Leu Glu Asp Gly Ser Leu Leu Leu Glu 

405 410 

AAT GAA CCA TTC AGT CTA CAA GCG ATC TTT GAA GAG GTC ATC 1504 
Asn Glu Pro Phe Ser Leu Gln Ala Ile Phe Glu Glu Val Ile
415 420 425 

TCT TTG ATA AAG CCA ATC GCA TCA GTG AAG AAA CTA TCA ACG 1546
Ser Leu Ile Lys Pro Ile Ala Ser Val Lys Lys Leu Ser Thr
430 435 

AAT CTG ATT CTG TCT GCA GAC TTA CCA ACT TAT GCT ATT GGT 1588 
Asn Leu Ile Leu Ser Ala Asp Leu Pro Thr Tyr Ala Ile Gly
445 450 455 

GAT GAG AAA CGT CTG ATG CAA ACA ATT CTT AAC ATC ATG GGC 1630 
Asp Glu Lys Arg Leu Met Gln Thr Ile Leu Asn Ile Met Gly

460 465 

f^n ^ 11"^ ^9*^ ^ GGC TAC ATC TCC ATA ATA 1672 

Asn Ala Val Lys Phe Thr Lys Glu Gly Tyr Ile Ser Ile Ile

475 480 

GCC TCT ATC ATG AAA CCC GAG TCC TTA CAA GAA TTA CCA TCT 1714 
Ala Ser Ile Met Lys Pro Glu Ser Leu Gln Glu Leu Pro Ser

485 490 495 

CCA GAA TTT TTT CCA GTT CTC AGT GAC AGT CAC TTC TAC CTA 1756
Pro Glu Phe Phe Pro Val Leu Ser Asp Ser His Phe Tyr Leu 
500 505 510 

§1? T^^? ^9^ "^GT ATT CAC ACA CAA ^ 1798 

Cys Val Gln Val Lys Asp Thr Gly Cys Gly Ile His Thr Gln
515 520 525 

GAC ATT CCT TTG CTC TTT ACC AAA TTT GTA CAG CCT CGG ACC 1840 
Asp Ile Pro Leu Leu Phe Thr Lys Phe Val Gln Pro Arg Thr

530 535 

GGA ACT CAG AGG AAC CAT TCC GGT GGA GGA CTC GGG CTA GCT 1882 
Gly Thr Gln Arg Asn His Ser Gly Gly Gly Leu Gly Leu Ala

CTC TGT AAA CGG TTT GTC GGG CTA ATG GGA GGA TAC ATG TGG 1924 
Leu Cys Lys Arg Phe Val Gly Leu Met Gly Gly Tyr Met Trp
560 565

ATA GAA AGT GAA GGC CTA GAG AAA GGC TGC ACA GCT TCG TTC 1966 
Ile Glu Ser Glu Gly Leu Glu Lys Gly Cys Thr Ala Ser Phe
570 575 580 

ATC ATC AGG CTT GGT ATC TGC AAC GGT CCA AGC AGT AGC AGT 2008 
Ile Ile Arg Leu Gly Ile Cys Asn Gly Pro Ser Ser Ser Ser
585 590 595 
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GGT TCA ATG GCG CTA CAT CTT GCA GCT AAA TCA CAA ACC AGA 2050 
Gly Ser Met Ala Leu His Leu Ala Ala Lys Ser Gln Thr Arg

600 605 

CCG TGG AAC TGG TGATACTTAC GTTGGAAAGA CTTGTATTGA 2092 

Pro Trp Asn Trp 

610 

GGTGAGACTT TTTAACTACA CAGCAGCAAG AGAAAGAAGA AAATACATGA 2142 

CCGGACGGTG TGATCTAACT TATTGGATTT TGTTGGATGT AATATGTAAA 2192 

ATAAAAATCC TATATACGGG GAGAGGTACC TTATCTGTTC TCACTATATT 2242 

TTATTGAACA TTACTTTAGA GAATATGTTT TGGAATTCAC TACTAAATAA 2292 

ACGATATAAA TCTTCACGAA AA 2314 
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GAATTCGAAC TGCAATGGGA TAAACATTAT ATGCGTTTTA ATAATAGGTT 50 

GGTGAAGTTT ATAATTTACA CCATTTGAAA AGCCTTCCAA ATTTAGAAAC 100 

TACATTTTTG CAGACCCATG TGAGCTCATA TGAATCAATC ATAGCCTTGA 150 

TGTTGTAAAA CAAATTATGA TTATAAAAAT GTGATAGTAT ATTACATGCA 200 

TAAAAAATAA AGGAGAGTAA ATGAAAGTCA AATCTGGGTT TTATGAACTG 250 

AAAGTTGAAG TTTAGAAGTA GAAGTAGCGA TCAAAGTATG ACCAGTTAAA 300 

AGGCCCAATA TCATTTGGAG GTTTGATTTT TGGGTTCGTA AATTTCAAGA 350 

GCCAGATTAT GATTTGCTGG GCTTAAAAAT CATGGAAAAA TTGAAATGAC 400 

GGTGTTAAAA TATATAACTC AAATTAAAGA TTTTAATTGG GTGTAGTAGG 450 

CTGATTTTTT TATAAGAATC TTGTCTATAG ATGCTTCAAG GTTATGCCTT 500 

ATAGTACTGG TTGTAAAACA CCACTATCTA ATTTTGAAGC TGGTCAGAAC 550 

TATAAGGTAT GTTGTTGTTC GCCTTGTTGC TAATGAAGAT TATAACATTC 600 

TGTTGTTGCA TTTTTTTTTT TTTTTTTGTG TTAAATATAT ATATTTTTTT 650 

TGCATATTTA TTGTTGCATA TTGTGTTGCA TATTTAGTAA TGGTTACATT 700 

CCCTGTTATC GGAGACCAAG ATAATACGGC TCTGTGGCAT GGACTACTAC 750 

TCCATGGATT CTTCCAAGTA ATCTTGCTTT GTGTGTCAAT GCAAAGTTTG 800 

TTTATCTTAA GGTTCGTCAA CAACACTGGA AAAGTCTACA TTGTTGCTGA 850 

ATCTCGGTTG TCATCGCTTC CTAGTGATAA GCCTAAGGCC GGCTTAACTA 900 

ATGGAACTTA CTAGTGATAC CATAATGCGA AAGGTGCTAA TTAAGCTTGA 950 

CAGTGAAGAG GATTCTTATC AAGTTTTGGA AAATTTTAAT GGAGATTCCT 1000 

TGGTTGGGAA GAAGTATGAA CCTTTGTTTG ATTACTTTTA GCGATTTCTC 1050 

AAGTGTGACT TTTCGACTAG TAGCAGATGA TTATGTCATG AATGATAGTG 1100 

GTACTGGTAT TGTCCATTGT GCTCCTGTCT TTGGTGCAGA TGACTATCGT 1150 

GTTTGTCTTG AGAACGAGAT AATTAAGAAG GTTAGATTTG ACAACATCTT 1200 

CCTTATATCA CCACCTTTAA CATTAAGTTT ATTTTCTTTC TTGTTTAAGT 1250 

TTACAGTATC TTCAAGAACC CATGTTCATG ACACATTTTG TTCATGTGTT 1300 

GTTTAGATTG TCAGAGATTT CAAACGTCCA GATGGTTTGA AAGATACAGA 1350 

GATTGATGCA GCTGTAGATA GTACATATCT TAATTAAAAA TACCACTTCT 1400 

CTATGCTCTA TTGTTGAGGA AACATATAAT ATTTGCATTC GTTCATGGTT 1450 

CAGATATGAT GTTATGGTAA TTCTTGATCT ACGAGAAGAT GAATCTTTGA 1500 

AAAACGAAGG TGTTGCCCGT GAGGTAAATA AATGTAACCG AAGCGATTAA 1550 

TGGTCATATA TAAGTTGTAT ATTTGATATA TGGGTTTCCT TCTCATTGTG 1600 
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CTCATGCATT GAAAAGCACC CTGTTATGAC TGTGGTTCTA GGAGAACATT 1650 

TGCATTTGAC AGTCGGTGAC TAATTGTTAA GCAAGAAGAA CGCATGAGAG 1700

CCTTTTAAAG TGTTTTCTTC TAGATCGTTG CAAAAAGTTA AATGTCTCTT 1750 

GAGACTTTGT ACTCATTCTA TAGATAAAGA TGGGATTTAT TACAAAAACA 1800 

ACAAGAAACT TTGTTACTTG TGGAAATTCA AAATTATCCG AACTAGCTTC 1850 

ACAAAATATG CTCAAGAGTT TCAATGTATT TTTTTTTGTT CTGTAATTGT 1900 

ATGACTCCGT TTGAAGCATC AAGATTATGG TTATAGGTAG TGATGCTAAA 1950 

ACTCTCTGTT GTTACAGTGA CCACTAAAAA CACCAACAAA AAAAACTTAG 2000 

GTAACGTGTC GTCTAAAAAC TTCTAGGTTC AATTTCTTTA GATAGTACTA 2050 

TCAATAAATA AAATAAATAT GTACAAAGGC TTTAAACAAT GATGTTTTTC 2100 

AAAGATGATT GGTAGATACT AATTAGAGCT TCAATATAAA AGAACACATG 2150 

CGATTCTGAC ATTCTGTGGT GTAACATGGT TTCTTCTAGA GTCAAAACCA 2200 

TACAATTAAA AGTTAGGAAA GTAATAGCAA TGTGGTTTCA AATATATACT 2250 

CATTACTCTT TAGATTCATG TATGGTGAAG GAAACATTAT AATAAAATCA 2300 

AAGATCACAG TTTTGTAGGT CCCTCATATT AATCAACATC TTAAGGCGTT 2350 

ATACATATCT TCTTTTTGTA AATATTTGAC TAATTAAAAT ATCTAATTAG 2400 

AGTATTAGAC TAATCTCATC AAATATCCGA CTACTTGTGT CAGTTCAAAA 2450 

CACAGTGATT ACGTTAGATT TTGTGCTCTT TTGTTTATAA ACAAAGCTAA 2500 

TTTAAGAAAT ATATGATCTA TTTGCCTCCT TGGTCTTAAT TTTATACTTT 2550 

CTTGGAATAA AACACATTTA TTAAAATAAT TTTTAGGGTC CTAGATTCAT 2600 

GTCATGTGGC TTGATAGTTT CCAACAATTA TACCAATATT TTACTCATTC 2650 

ATATACAAAT AAACAAGCTT TATTCTATTC TTCAGTCTCA TGATATACGG 2700 

GATTTTGATA AAATTCAGAG TACCCATTAA TTATTCTATG TTACAGCTTG 2750 

TAATAAGTTA AATTTATAAA ACGTACAAGT TGAGGAAATA ACAAATGTTT 2800 

TCAATATTAA ATGATTTATT AATACATTAG TGACCAAAAA ATTATTAAGT 2850 

GTAAGAAAAA AAACACAACT CAGAAAAAAT TCAAAAGACC GTCTAAGTTC 2900 
GGTTCATGTA AGAACAAGTG GGACCTCTTT AAGTTTCTAA ATCAGAGAAT 2950 
AAAGAAGAAG AAAAAATCTC AAAACCTTCC TCTAAAACCA ACGGCTCCTA 3000 
CCTTTACTTA CACCCTATAC ATACACTTCT CTTTTTATCC TCCATCGGCG 3050 
GCTTATGGCG GTTTTCCGGC ACTAATCATC TCCGGCATAT ATAAATAAAC 3100 
GTACTTCACG TTTTTTTATA TAACTTCAAA GTAGTTTCAG ATTTGTCTCT 3150 
ATCTCTTCAC TTTTAAGTCT TCTGGTTTTG TCATCACCAG CTTTTTTTGT 3200 
TCTCTCTCTG TCTCTGTCTC TGTCTTTCTC TTTGTGTATT TTTATTCTCG 3250 
TCATCGTTGT TCTTCTATGA GAGGAAGATC GGAATGTCGA AGAGAATTAG 3300

AAGATTCTCG TACATCACTT CGTTGGAATT TCACAGGTCG ATGAGAGATC 3350

TGAGAACTGT TTCATTTTGA TCCAAACTCA TCTCTTTCAG GTATTCCAAA 3400

TTTGTCTTTC TCTGTTCTTT CTACTATTAC CCAAATTAAA GTTTTGATTT 3450

TTATTTCTCA CTCTGTTTCT TGTTTTTCTA ATTGCAGAGT ATAATGGACT 3500

AAGCATTTTT TTTCTCCGAA G ATG GTT AAA GAA ATA GCT TCT TGG 3545
Met Val Lys Glu Ile Ala Ser Trp
1 5

TTA TTG ATA CTA TCA ATG GTG GTG TTT GTT TCT CCG GTT TTA 3587
Leu Leu Ile Leu Ser Met Val Val Phe Val Ser Pro Val Leu
10 15 20

GCT ATA AAC GGC GGT GGT TAT CCA CGA TGT AAC TGC GAA GAC 3629
Ala Ile Asn Gly Gly Gly Tyr Pro Arg Cys Asn Cys Glu Asp
35

GAA GGA AAC AGT TTC TGG AGT ACA GAG AAC ATT CTA GAA ACT 3671
Glu Gly Asn Ser Phe Trp Ser Thr Glu Asn Ile Leu Glu Thr
40 45 50

CAA AGA GTA AGC GAT TTC TTA ATC GCA GTA GCT TAT TTC TCA 3713
Gln Arg Val Ser Asp Phe Leu Ile Ala Val Ala Tyr Phe Ser
55 60

ATC CCT ATT GAG TTA CTT TAC TTC GTG AGT TGT TCC AAT GTT 3755
Ile Pro Ile Glu Leu Leu Tyr Phe Val Ser Cys Ser Asn Val
65 70 75

CCA TTC AAA TGG GTT CTC TTT GAG TTT ATC GCC TTC ATT GTT 3797
Pro Phe Lys Trp Val Leu Phe Glu Phe Ile Ala Phe Ile Val
80 85 90

CTT TGT GGT ATG ACT CAT CTT CTT CAT GGT TGG ACT TAC TCT 3839
Leu Cys Gly Met Thr His Leu Leu His Gly Trp Thr Tyr Ser
95 100 105

GCT CAT CCA TTT AGA TTA ATG ATG GCG TTT ACT GTT TTC AAG 3881
Ala His Pro Phe Arg Leu Met Met Ala Phe Thr Val Phe Lys
110 115 120

ATG TTG ACT GCT TTA GTC TCT TGT GCT ACT GCG ATT ACG CTT 3923
Met Leu Thr Ala Leu Val Ser Cys Ala Thr Ala Ile Thr Leu
125 130

ATT ACT TTG ATT CCT CTG CTT TTG AAA GTT AAA GTT AGA GAG 3965
Ile Thr Leu Ile Pro Leu Leu Leu Lys Val Lys Val Arg Glu
135 140 145

TTT ATG CTT AAG AAG AAA GCT CAT GAG CTT GGT CGT GAA GTT 4007
Phe Met Leu Lys Lys Lys Ala His Glu Leu Gly Arg Glu Val
150 155 160

GGT TTG ATT TTG ATT AAG AAA GAG ACT GGC TTT CAT GTT CGT 4049
Gly Leu Ile Leu Ile Lys Lys Glu Thr Gly Phe His Val Arg
165 170 175
ATG CTT ACT CAA GAG ATT CGT AAG TCT TTG GAT CGT CAT ACG 4091
Met Leu Thr Gln Glu Ile Arg Lys Ser Leu Asp Arg His Thr
180 185 190

ATT CTT TAT ACT ACT TTG GTT GAG CTT TCG AAG ACT TTA GGG 4133
Ile Leu Tyr Thr Thr Leu Val Glu Leu Ser Lys Thr Leu Gly
195 200

TTG CAG AAT TGT GCG GTT TGG ATG CCG AAT GAC GGT GGA ACG 4175
Leu Gln Asn Cys Ala Val Trp Met Pro Asn Asp Gly Gly Thr
205 210 215

GAG ATG GAT TTG ACT CAT GAG TTG AGA GGG AGA GGT GGT TAT 4217
Glu Met Asp Leu Thr His Glu Leu Arg Gly Arg Gly Gly Tyr
220 225 230

GGT GGT TGT TCT GTT TCT ATG GAG GAT TTG GAT GTT GTT AGG 4259
Gly Gly Cys Ser Val Ser Met Glu Asp Leu Asp Val Val Arg
240 245

ATT AGG GAG AGT GAT GAA GTG AAT GTG TTG AGT GTT GAC TCG 4301
Ile Arg Glu Ser Asp Glu Val Asn Val Leu Ser Val Asp Ser
250 255 260

TCC ATT GCT CGA GCT AGT GGT GGT GGT GGG GAT GTT AGT GAG 4343
Ser Ile Ala Arg Ala Ser Gly Gly Gly Gly Asp Val Ser Glu
265

ATT GGT GCC GTG GCT GCT ATT AGA ATG CCG ATG CTT CGT GTT 4385
Ile Gly Ala Val Ala Ala Ile Arg Met Pro Met Leu Arg Val
275 280 285

TCG GAT TTT AAT GGA GAG CTA AGT TAT GCG ATA CTT GTT TGT 4427
Ser Asp Phe Asn Gly Glu Leu Ser Tyr Ala Ile Leu Val Cys
290 295 300

GTT TTA CCG GGC GGG ACC CGT CGG GAT TGG ACT TAT CAG GAG 4469
Val Leu Pro Gly Gly Thr Arg Arg Asp Trp Thr Tyr Gln Glu
305

ATT GAG ATT GTT AAA GTT GTG GCG GAT CAA GTA ACC GTT GCG 4511
Ile Glu Ile Val Lys Val Val Ala Asp Gln Val Thr Val Ala
320 325 330

TTA GAT CAT GCA GCG GTT CTT GAA GAG TCT CAG CTT ATG AGG 4553
Leu Asp His Ala Ala Val Leu Glu Glu Ser Gln Leu Met Arg
335 340

GAG AAG CTG GCG GAA CAG AAC AGG GCG TTG CAG ATG GCG AAG 4595
Glu Lys Leu Ala Glu Gln Asn Arg Ala Leu Gln Met Ala Lys
345 350 355

AGA GAC GCG TTG AGA GCG AGC CAA GCG AGG AAT GCG TTT CAG 4637
Arg Asp Ala Leu Arg Ala Ser Gln Ala Arg Asn Ala Phe Gln
360 365 370

AAA ACG ATG AGC GAA GGG ATG AGG CGT CCT ATG CAT TCG ATA 4679
Lys Thr Met Ser Glu Gly Met Arg Arg Pro Met His Ser Ile
375 380 385

CTC GGT CTT TTG TCG ATG ATT CAG GAC GAG AAG TTG AGT GAC 4721
Leu Gly Leu Leu Ser Met Ile Gln Asp Glu Lys Leu Ser Asp
390 395 400



FIG. 14D 




GAG CAG AAA ATG ATT GTT GAT ACG ATG GTT AAA ACA GGG AAT 4763
Glu Gln Lys Met Ile Val Asp Thr Met Val Lys Thr Gly Asn
405 410

GTT ATG TCG AAT TTG GTG GGG GAG TCT ATG GAT GTG CCT GAG 4805
Val Met Ser Asn Leu Val Gly Asp Ser Met Asp Val Pro Asp
415 420 425

GGT AGA TTT GGT ACG GAG ATG AAA CCG TTT AGT GTG CAT CGT 4847
Gly Arg Phe Gly Thr Glu Met Lys Pro Phe Ser Leu His Arg
430 435 440

ACG ATC CAT GAA GCA GCT TGT ATG GCG AGA TGT TTG TGT CTA 4889
Thr Ile His Glu Ala Ala Cys Met Ala Arg Cys Leu Cys Leu
445 450 455

TGC AAT GGA ATT AGG TTC TTG GTT GAC GCG GAG AAG TCT CTA 4931
Cys Asn Gly Ile Arg Phe Leu Val Asp Ala Glu Lys Ser Leu
460 465 470

CCT GAT AAT GTA GTA GGT GAT GAA AGA AGG GTC TTT CAA GTG 4973
Pro Asp Asn Val Val Gly Asp Glu Arg Arg Val Phe Gln Val
475 480

ATA CTT CAT ATG GTT GGT AGT TTA GTA AAG CCT AGA AAA CGT 5015
Ile Leu His Met Val Gly Ser Leu Val Lys Pro Arg Lys Arg
485 490 495

CAA GAA GGA TCT TCA TTG ATG TTT AAG GTT TTG AAA GAA AGA 5057
Gln Glu Gly Ser Ser Leu Met Phe Lys Val Leu Lys Glu Arg

GGA AGC TTG GAT AGG AGT GAT CAT AGA TGG GCT GCT TGG AGA 5099
Gly Ser Leu Asp Arg Ser Asp His Arg Trp Ala Ala Trp Arg
515 520 525

TCA CCG GCT TCT TCA GCA GAT GGA GAT GTG TAT ATA AGA TTT 5141
Ser Pro Ala Ser Ser Ala Asp Gly Asp Val Tyr Ile Arg Phe
530 535 540

GAA ATG AAT GTA GAG AAT GAT GAT TCA AGT TCT CAA TCA TTT 5183
Glu Met Asn Val Glu Asn Asp Asp Ser Ser Ser Gln Ser Phe
545 550

GCT TCT GTT TCC TCC AGA GAT CAA GAA GTT GGT GAT GTT AGA 5225
Ala Ser Val Ser Ser Arg Asp Gln Glu Val Gly Asp Val Arg
555 560 565

TTC TCC GGC GGC TAT GGG TTA GGA CAA GAT CTA AGC TTT GGT 5266
Phe Ser Gly Gly Tyr Gly Leu Gly Gln Asp Leu Ser Phe Gly
570 575 580

GTT TGT AAG AAA GTG GTG CAG GTGAGTTTCC TTACATATCT 5316
Val Cys Lys Lys Val Val Gln

CTTTCTAAAG TTCCTGTCAT TAGTCTGAGT TTCTGTTTAG GAGTTCTTTG 5359
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ATAATGTGTG CAG TTG ATT CAT GGG AAT ATC TCG GTG GTC CCT 

Leu Ile His Gly Asn Ile Ser Val Val Pro
590 595 

GGC TCG GAT GGT TCA CCG GAG ACC ATG TCG TTG CTC CTT CGG 
Gly Ser Asp Gly Ser Pro Glu Thr Met Ser Leu Leu Leu Arg 
600 605 610 

TTT CGA CGT AGA CCC TCC ATA TCT GTC CAT GGA TCC AGC GAG 
Phe Arg Arg Arg Pro Ser Ile Ser Val His Gly Ser Ser Glu
615 620 625 

TCG CCA GCT CCT GAC CAC CAC GCT CAC CCA CAT TCG AAT TCT 
Ser Pro Ala Pro Asp His His Ala His Pro His Ser Asn Ser
630 635 

CTG TTA CGT GGC TTA CAA GTT TTA TTG GTA GAC ACC AAC GAT 
Leu Leu Arg Gly Leu Gln Val Leu Leu Val Asp Thr Asn Asp

TCG AAC CGG GCA GTT ACA CGT AAA CTC TTA GAG AAA CTC GGG 
Ser Asn Arg Ala Val Thr Arg Lys Leu Leu Glu Lys Leu Gly 

660 665 

TGC GAT GTA ACC GCG GTT TCC TCT GGA TTC GAT TGC CTT ACC 
Cys Asp Val Thr Ala Val Ser Ser Gly Phe Asp Cys Leu Thr 
670 675 680 

GCC ATT GCT CCC GGC TCG TCC TCG CCT TCT ACT TCG TTT CAA 
Ala Ile Ala Pro Gly Ser Ser Ser Pro Ser Thr Ser Phe Gln
685 690 695 

GTG GTG GTG CTT GAT CTT CAA ATG GCA GAG ATG GAC GGT TAT 
Val Val Val Leu Asp Leu Gln Met Ala Glu Met Asp Gly Tyr
700 705 710 

GAA GTG GCC ATG AGG ATC AGG AGT CGA TCT TGG CCG TTG ATT 
Glu Val Ala Met Arg Ile Arg Ser Arg Ser Trp Pro Leu Ile

GTG GCG ACG ACA GTG AGC TTG GAT GAA GAA ATG TGG GAC AAG 
Val Ala Thr Thr Val Ser Leu Asp Glu Glu Met Trp Asp Lys 

730 735 

TGT GCA CAG ATT GGA ATC AAT GGA GTT GTG AGA AAG CCA GTG 
Cys Ala Gln Ile Gly Ile Asn Gly Val Val Arg Lys Pro Val
740 745 

GTG TTA AGA GCT ATG GAG AGT GAG CTC CGA AGA GTA TTG TTG 
Val Leu Arg Ala Met Glu Ser Glu Leu Arg Arg Val Leu Leu
755 760 

CAA GCT GAC CAA CTT CTC TAAGTTGTTA TCTCAACTTC TCTTCTACAT 
Gln Ala Asp Gln Leu Leu
770 

TCAAAATTTT TACACCATAG ATTTATGTCA AATATATCAA AATGAAATTT 
CGAAATTGTT ATTATATATA CCACCCATAT CTCTATGATT TGTACATCCT 
Q,j..j.^.r^.rTTT GTTCTTTTTC TCATTTTGAA CCCCACGAAA TTGCATTGAA 
TCTTAGTATT TCGTAGGGTC AAGAAGGAGT CAGTTTCGTA GTTTTTTGTT 
TTCTTTATGT TACGAACTTA CGAAACTGAA TATGGCATTA TAGAGTTTT 
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ATG GTT AAA GAA ATA GCT TCT TGG TTA TTG ATA CTA TCA ATG 
Met Val Lys Glu Ile Ala Ser Trp Leu Leu Ile Leu Ser Met
1 5 10

S=f v!? dF ?.^7 ^^"^ ^"^T "^"^A GCT ATA AAC GGC GGT GGT 

Val Val Phe Val Ser Pro Val Leu Ala Ile Asn Gly Gly Gly
15 20 25 

TAT CCA CGA TGT AAC TGC GAA GAC GAA GGA AAC AGT TTC TGG 126
Tyr Pro Arg Cys Asn Cys Glu Asp Glu Gly Asn Ser Phe Trp
30 35 40

cf^ ^G ^ ACT CAA AGA GTA AGC GAT TTC 168 

Ser Thr Glu Asn Ile Leu Glu Thr Gln Arg Val Ser Asp Phe
45 50 55

TTA ATC GCA GTA GCT TAT TTC TCA ATC CCT ATT GAG TTA CTT 210 
Leu Ile Ala Val Ala Tyr Phe Ser Ile Pro Ile Glu Leu Leu
60 65 70 

?vr v^? ^^"^ G'^'^ TTC AAA TGG GTT CTC 252 

Tyr Phe Val Ser Cys Ser Asn Val Pro Phe Lys Trp Val Leu 

75 80 

TTT GAG TTT ATC GCC TTC ATT GTT CTT TGT GGT ATG ACT CAT 294
Phe Glu Phe Ile Ala Phe Ile Val Leu Cys Gly Met Thr His
85 90 95

CTT CTT CAT GGT TGG ACT TAG TCT GCT CAT CCA TTT AGA TTA 336 
Leu Leu His Gly Trp Thr Tyr Ser Ala His Pro Phe Arg Leu
100 105 110 

mI^ 5*^^ ^9*^ ^'^'T TTC AAG ATG TTG ACT GCT TTA GTC 378 

Met Met Ala Phe Thr Val Phe Lys Met Leu Thr Ala Leu Val
115 120 125 

TCT TGT GCT ACT GCG ATT ACG CTT ATT ACT TTG ATT CCT CTG 420 
Ser Cys Ala Thr Ala Ile Thr Leu Ile Thr Leu Ile Pro Leu
130 135 140 

CTT TTG AAA GTT AAA GTT AGA GAG TTT ATG CTT AAG AAG AAA 462 
Leu Leu Lys Val Lys Val Arg Glu Phe Met Leu Lys Lys Lys

GCT CAT GAG CTT GGT CGT GAA GTT GGT TTG ATT TTG ATT AAG 504 
Ala His Glu Leu Gly Arg Glu Val Gly Leu Ile Leu Ile Lys
155 160 165 

^ TTT CAT GTT CGT ATG CTT ACT CAA GAG ATT 546 

Lys Glu Thr Gly Phe His Val Arg Met Leu Thr Gln Glu Ile
170 175 180

CGT AAG TCT TTG GAT CGT CAT ACG ATT CTT TAT ACT ACT TTG 588 
Arg Lys Ser Leu Asp Arg His Thr Ile Leu Tyr Thr Thr Leu
185 190 195 

GTT GAG CTT TCG AAG ACT TTA GGG TTG CAG AAT TGT GCG GTT 630 
Val Glu Leu Ser Lys Thr Leu Gly Leu Gln Asn Cys Ala Val
200 205 210 

TGG ATG CCG AAT GAC GGT GGA ACG GAG ATG GAT TTG ACT CAT 672 
Trp Met Pro Asn Asp Gly Gly Thr Glu Met Asp Leu Thr His 

215 220 
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GAG TTG AGA GGG AGA GGT GGT TAT GGT GGT TGT TCT GTT TCT 714 
Glu Leu Arg Gly Arg Gly Gly Tyr Gly Gly Cys Ser Val Ser
225 230 235

ATG GAG GAT TTG GAT GTT GTT AGG ATT AGG GAG AGT GAT GAA 756 
Met Glu Asp Leu Asp Val Val Arg Ile Arg Glu Ser Asp Glu
240 245 250

GTG AAT GTG TTG AGT GTT GAC TCG TCC ATT GGT CGA GCT AGT 798
Val Asn Val Leu Ser Val Asp Ser Ser Ile Ala Arg Ala Ser
255 260 265

GGT GGT GGT GGG GAT GTT AGT GAG ATT GGT GCC GTG GCT GCT 840
Gly Gly Gly Gly Asp Val Ser Glu Ile Gly Ala Val Ala Ala


ATG AAA CCG TTT AGT CTG CAT CGT ACU, Ai^ ^ ^ 

Met Lys Pro Phe Ser Leu His Arg Thr He His Glu Ala Aia 

440 
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^IJ oc. V...- — - 280 

-__ -J.-- pp- nmr: PTT CGT GTT TCG GAT TTT AAT GGA GAG 

III ^ SI? ?ro Se? lH Val Ser Asp Phe Asn Gly Glu 
^ 285 29Q 

CTA AGT TAT GCG ATA CTT GTT TGT GTT TTA CCG GGC GGG ACC 924 

Leu Ser Tyr Ala Ile Leu Val Cys Val Leu Pro Gly Gly Thr
295 300 

CGT CGG GAT TGG ACT TAT CAG GAG ATT GAG ATT GTT AAA GTT 966 

Arg Arg Asp Trp Thr Tyr Gln Glu Ile Glu Ile Val Lys Val

GTG GCG GAT CAA GTA ACC GTT GCG TTA GAT CAT GCA GCG GTT 
Val Ala Asp Gln Val Thr Val Ala Leu Asp His Ala Ala Val
325 330 335

CTT GAA GAG TCT CAG CTT ATG AGG GAG AAG CTG GCG GAA CAG 
Leu Glu Glu Ser Gln Leu Met Arg Glu Lys Leu Ala Glu Gln
340 345 350

AAC AGG GCG TTG CAG ATG GCG AAG AGA GAC GCG TTG AGA GCG 1092 
Asn Arg Ala Leu Gln Met Ala Lys Arg Asp Ala Leu Arg Ala

355 360 

AGC CAA GCG AGG AAT GCG TTT CAG AAA ACG ATG AGC GAA GGG 1134 
Ser Gln Ala Arg Asn Ala Phe Gln Lys Thr Met Ser Glu Gly
365 370 375 

ATG AGG CGT CCT ATG CAT TCG ATA CTC GGT CTT TTG TCG ATG 1176 
Met Arg Arg Pro Met His Ser Ile Leu Gly Leu Leu Ser Met



1008 



1050 



ATT CAG GAC GAG AAG TTG AGT GAC GAG CAG AAA ATG ATT GTT 1218 
Ile Gln Asp Glu Lys Leu Ser Asp Glu Gln Lys Met Ile Val

GAT ACG ATG GTT AAA ACA GGG AAT GTT ATG TCG AAT TTG GTG 1260 
Asp Thr Met Val Lys Thr Gly Asn Val Met Ser Asn Leu Val

„P c-aT GTG CCT GAC GGT AGA TTT GGT ACG GAG 1302 

Gly Sp III M^t Sp vll Pro Sp Gly Arg Phe Gly Thr Glu 

4 2 o 

ATG AAA CCG TTT AGT CTG CAT CGT ACG ATC CAT GAA GCA GCT 1344 

Met Lys Pro Phe Ser Leu His Arg Thr Ile His Glu Ala Ala
435 440
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r?Z mI^ b?^ "^T^ CTA TGC AAT GGA ATT AGG TTC l^ftfi 

Cys Met Ala Arg Cys Leu Cys Leu Cys Asn Gly lie A?g Phi 

455 460 

TTG GTT GAG GCG GAG AAG TCT CTA CCT GAT AAT GTA GTA GHT i aoq 
Leu Val Asp Ala Glu Lys Ser Leu Pro Asd §al vll Glv 

470 475 

'HI ^ ^"^9 ^"^"^ GTG ATA CTT CAT ATG GTT GGT 14 70 

Asp Glu Arg Arj Val Phe Gin Val Il| Leu His Met Val Gl^ 

AGT TTA GTA AAG CCT AGA AAA CGT CAA GAA GGA TCT TCA TTr i -5 
Ser Leu Val Lys Pro Arg Lys Arg Gin Glu Gl? ler lH ^^^^ 

Me? Phe S=T ^ GAA AGA GGA AGC TTG GAT AGG AGT 1554 

Met Phe Lys Val Leu L^s Glu Arg Gly Ser Leu Asp Arg Ser 

GAT CAT AGA TGG GCT GCT TGG AGA TCA CCG GCT TCT TCA GCA TiQfi 
Asp His Arg Trp Ala Ala Trp Arg Ser Pro Ala Ser Ser Al¥ 

525 530 

H"^? ATA AGA TTT GAA ATG AAT GTA GAG AAT Ifi-^fi 

Asp Gly Asp Val Tyr He Arg Phe Glu Met Asn Val Glu Asn 

540 . 545 

GAT GAT TCA AGT TCT CAA TCA TTT GCT TCT GTT TCC TCC AGA 1 aan 
Asp Asp Ser Ser Ser Gin Ser Phe Ala Ser vil Ser Ser ^ 

555 566 

GAT CAA GAA GTT GGT GAT GTT AGA TTC TCC GGC GGC TAT GGCi 179 9 
Asp Gin Glu val Gl^ Asp Val Arg Phe Ser Gly Gl? Ty? Gly 

TTA GGA CAA GAT CTA AGC TTT GGT GTT TGT AAG AAA am rrr t7ca 
Leu Gly Gin Asp Leu Ser Phe Gly Val Cys lOS 1^ vll vll ^^^^ 

ooU 585 

CAG TTG ATT CAT GGG AAT ATC TCG GTG GTC CCT GGC TCG GAT 1806 
Gin L|u He His Gly Asn Il| Ser Val Val Pro Gl^ Se? Sp ^ 

GGT TCA CCG GAG ACC ATG TCG TTG CTC CTT CGG TTT CGA CGT lfl4R 
Gly ser Pro Glu Thr Met Ser Leu Leu Leu Arg Phe A?^ Arg ^ 

610 615 ^ 

AGA CCC TCC ATA TCT GTC CAT GGA TCC AGC GAG TCG CCA GCT i ft on 
Arg Pro Ser lie Ser Val His Gly Ser Ser GliI str pS All ° 
o20 625 630 

CCT GAC CAC CAC GCT CAC CCA CAT TCG AAT TCT CTG TTA rrT i q-j-j 
Pro Asp His His Ala His Pro His Ser Se? LeS lIu pSI 

635 640 ^ 

GGC TTA CAA GTT TTA TTG GTA GAC ACC AAC GAT TCG AAC CGG 1 did 
Gl^ Leu Gin Val Leu L|u Val Asp Thr Asn A|| Se? A^S A?g ^^^^ 

tIH ^^"^ ^ ^"^C TTA GAG AAA CTC GGG TGC GAT GTA 2016 

Ala Val Thr Arg Lys Leu Leu Glu Lys Leu Gly Cys Asp Val 
o60 665 670 
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ACC GCG GTT TCC TCT GGA TTC GAT TGC CTT ACC GCC ATT GCT 
Thr Ala Val Ser Ser Gly Phe Asp Cys Leu Thr Aia lie Aia 



675 



2058 



2100 



CCC GGC TCG TCC TCG CCT TCT ACT TCG TTT CAA GTG GTG GTG 
Pro Gly Ser Ser Ser Pro Ser Thr Ser Phe Gin Val Val Val 
690 'uu 

pan ATG GCA GAG ATG GAC GGT TAT GAA GTG GCC 2142 

III lZ ilS mI? Alt GlS Met ASP Glv Tyr Glu Val Ala 

• 705 

ATG AGG ATC AGG AGT CGA TCT TGG CCG TTG ATT GTG GCG ACG 2184 

Met Arg He Arg Ser Arg Ser Trp Pro Leu lie Val Ala Thr 

ACA GTG AGC TTG GAT GAA GAA ATG TGG GAC AAG TGT GCA CAG 2226 
Thr Val Ser Leu Asp Glu Glu Met Trp Asp Lys Cys Aia Gin 
730 735 

ATT GGA ATC AAT GGA GTT GTG AGA AAG CCA GTG GTG TTA AGA 2268 
111 lie il? val Val Arg Lys Pro Val Val L|u Arg 

GCT ATG GAG AGT GAG CTC CGA AGA GTA TTG TTG CAA GCT GAC 2310 
Ala Met Glu Ser Glu Leu Arg Arg Val Leu Leu Gin Ala Asp 

760
CAA CTT CTC TAAGTTGTTA TCTCAACTTC TCTTCTACAT TCAAAATTTT 2359
Gln Leu Leu

TACACCATAG ATTTATGTCA AATATATCAA AATGAAATTT CGAAA 2404 
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iTi rn rri rr^ rr^ ry^ 

i i i i I XTTTT 


GTCAAAAGCT CGATGTAAAA ATCCGATGGC CACAAGCAAA 50

CCAACTTCAC GGAGATTGTG AAAATGGAGT AGTAGTTCAG 100

TGAAGTAGTA GATACTGAGA TCGCATTCTC CGGCGTCGTT TTTCACATCG 150

AAATAGTCGT GTAAAAAAAT GAAAAAATTG CTGCGAGACA GGTATGTGTC 200

GCAGCAGGAA ATAGCATCTT AAAGGAAGGA AGGAAGGAAA CTCGAAAGTT 250

ACTAAAAATT TTTGATTCTT TGGGACGAAA CGAGATA ATG GAA TCC 296
Met Glu Ser
1



TGT GAT TGC ATT GAG GCT TTA CTG CCA ACT GGT GAC CTG CTG 338 
Cys Asp Cys Ile Glu Ala Leu Leu Pro Thr Gly Asp Leu Leu
10 15

GTT AAA TAG CAA TAC CTC TCA GAT TTC TTC ATT GCT GTA GCC 380
Val Lys Tyr Gln Tyr Leu Ser Asp Phe Phe Ile Ala Val Ala
20 25 30

TAC TTT TCC ATT CCG TTG GAG CTT ATT TAT TTT GTC CAC AAA 422
Tyr Phe Ser Ile Pro Leu Glu Leu Ile Tyr Phe Val His Lys
35 40 

?H Sf^ l^^ GTC CTC ATG CAA TTT GGT 464 

Ser Ala Cys Phe Pro Tyr Arg Trp Val Leu Met Gln Phe Gly

GCT TTT ATT GTG CTC TGT GGA GCA ACA CAC TTT ATT AGC TTG 506 
Ala Phe Ile Val Leu Cys Gly Ala Thr His Phe Ile Ser Leu

1^^ mE^ "^TT ATG CAC TCT AAG ACG GTC GCT GTG GTT ATG 54 fi 

Trp Thr Phe Phe Met His Ser Lys Thr Val Ala Val Val Met 

80 85 

ThS tT^ ^ ^TG TTG ACA GCT GCC GTG TCC TGT ATC ACA 590 

Thr Ile Ser Lys Met Leu Thr Ala Ala Val Ser Cys Ile Thr
90 95 100 

GCT TTG ATG CTT GTT CAC ATT ATT CCT GAT TTG CTA AGT GTT 632 
Ala Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val
105 110 115 

^ 9^ 9^^ TTG TTC TTG AAA ACT CGA GCT GAA GAG CTT 674 

Lys Thr Arg Glu Leu Phe Leu Lys Thr Arg Ala Glu Glu Leu 

120 125 

GAC AAG GAA ATG GGC CTA ATA ATA AGA CAA GAA GAA ACT GGC 716 
Asp Lys Glu Met Gly Leu Ile Ile Arg Gln Glu Glu Thr Gly
130 135 140

AGA CAT GTC AGG ATG CTG ACT CAT GAG ATA AGA AGC ACA CTC 758 
Arg His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu
145 150 155

GAC AGA CAC ACA ATC TTG AAG ACT ACT CTT GTG GAG CTA GGT 
Asp Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly
160 165 170 
AGG ACC TTA GAG CTG GCA GAA TGT GCT TTG TGG ATG CCA TGC 842
Arg Thr Leu Asp Leu Ala Glu Cys Ala Leu Trp Met Pro

CAA GGA GGC CTG ACT TTG CAA CTT TCC CAT AAT TTA AAC AAT 884
Gln Gly Gly Leu Thr Leu Gln Leu Ser His Asn Leu Asn Asn
190 195

CTA ATA CCT CTG GGA TCT ACT GTG CCA ATT AAT CTT CCT ATT 926
Leu Ile Pro Leu Gly Ser Thr Val Pro Ile Asn Leu Pro Ile
200 205

ATC AAT GAA ATT TTT AGT AGG CCT GAA GCA ATA CAA ATT CCA 968
Ile Asn Glu Ile Phe Ser Ser Pro Glu Ala Ile Gln Ile Pro
215 220

CAT ACA AAT CCT TTG GCA AGG ATG AGG AAT ACT GTT GGT AGA 1010
His Thr Asn Pro Leu Ala Arg Met Arg Asn Thr Val Gly Arg
230 235 240

TAT ATT CCA CCA GAA GTA GTT GCT GTT CGT GTA CCG CTT TTA 1052 
Tyr Ile Pro Pro Glu Val Val Ala Val Arg Val Pro Leu Leu



CAC CTC TCA AAT TTT ACT AAT GAC TGG GCT GAA CTG TCT ACT 1094 
His Leu Ser Asn Phe Thr Asn Asp Trp Ala Glu Leu Ser Thr

260 

AGA AGT TAT GCG GTT ATG GTT CTG GTT CTC CCG ATG AAT GGC 1136 
Arg Ser Tyr Ala Val Met Val Leu Val Leu Pro Met Asn Gly

TTA AGA AAG TGG CGT GAA CAT GAG TTA GAA CTT GTG CAA GTT 1178 
Leu Arg Lys Trp Arg Glu His Glu Leu Glu Leu Val Gln Val
285 290 

GTC GCA GAT CAG GTT GCT GTC GCT CTT TCA CAT GCT GCA ATT 1220 
Val Ala Asp Gln Val Ala Val Ala Leu Ser His Ala Ala Ile
300 305 310

TTA GAA GAT TCC ATG CGA GCC CAT GAT CAG CTC ATG GAA CAG 1262
Leu Glu Asp Ser Met Arg Ala His Asp Gln Leu Met Glu Gln

AAT ATT GCT TTG GAT GTA GCT CGA CAA GAA GCA GAG ATG GCC 1304 
Asn Ile Ala Leu Asp Val Ala Arg Gln Glu Ala Glu Met Ala

ATC CGT GCA CGT AAC GAC TTC CTT GCT GTG ATG AAC CAT GAA 1346 
Ile Arg Ala Arg Asn Asp Phe Leu Ala Val Met Asn His Glu
340 345 350

ATG AGA ACG CCC ATG CAT GCA GTT ATT GCT CTG TGC TCT CTG 1388 
Met Arg Thr Pro Met His Ala Val Ile Ala Leu Cys Ser Leu
355 360 

TTT TTA GAA ACA GAC TTA ACT CCA GAG CAG AGA GTT ATG ATT 1430 
Phe Leu Glu Thr Asp Leu Thr Pro Glu Gln Arg Val Met Ile
370 375 

GAG ACC ATA TTG AAG AGC AGC AAT CTT CTT GCA ACA CTG ATA 1472 
Glu Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Ile
385 390 395
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AAT GAT GTT CTA GAT CTT TCT AGA CTT GAA GAT GGT ATT CTT 1514 
Asn Asp Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ile Leu

400 405 

GAA CTA GAA AAC GGA ACA TTC AAT CTT CAT GGC ATC TTA AGA 1556 

Glu Leu Glu Asn Gly Thr Phe Asn Leu His Gly Ile Leu Arg

410 415 420 

GAG GCC GTT AAT TTG ATA AAG CCA ATT GCA TCT TTG AAG AAA 1598 

Glu Ala Val Asn Leu Ile Lys Pro Ile Ala Ser Leu Lys Lys
425 430 435 

TTA TCT ATA ACT CTT GCT TTG GCT CTG GAT TTA CCT ATT CTT 1640 
Leu Ser Ile Thr Leu Ala Leu Ala Leu Asp Leu Pro Ile Leu
440 445 450 

GCT GTG GGT GAT GCA AAA CGT CTT ATC CAA ACT CTC TTA AAC 1682 
Ala Val Gly Asp Ala Lys Arg Leu Ile Gln Thr Leu Leu Asn
455 460 465 

GTG GTG GGA AAT GCT GTG AAG TTC ACT AAA GAA GGA CAT ATT 1724 
Val Val Gly Asn Ala Val Lys Phe Thr Lys Glu Gly His Ile

470 475 

TCA ATT GAG GCT TCA GTT GCC AAA CCA GAG TAT GCG AGA GAT 1766 
Ser Ile Glu Ala Ser Val Ala Lys Pro Glu Tyr Ala Arg Asp
480 485 490 

TGT CAT CCT CCT GAA ATG TTC CCT ATG CCA AGT GAT GGC CAG 1808 
Cys His Pro Pro Glu Met Phe Pro Met Pro Ser Asp Gly Gln
495 500 505 

TTT TAT TTG CGT GTC CAG GTT AGA GAT ACT GGG TGT GGA ATT 1850 
Phe Tyr Leu Arg Val Gln Val Arg Asp Thr Gly Cys Gly Ile
510 515 520 

AGC CCA CAA GAT ATA CCA CTA GTA TTC ACC AAA TTT GCA GAG 1892 
Ser Pro Gln Asp Ile Pro Leu Val Phe Thr Lys Phe Ala Glu
525 530 535 

TCA CGG CCT ACG TCA AAT CGA AGT ACT GGA GGG GAA GGT CTA 1934 
Ser Arg Pro Thr Ser Asn Arg Ser Thr Gly Gly Glu Gly Leu

540 545 

GGG CTT GCC ATT TGG AGA CGA TTT ATT CAA CTT ATG AAA GGT 1976 
Gly Leu Ala Ile Trp Arg Arg Phe Ile Gln Leu Met Lys Gly

AAC ATT TGG ATT GAG AGT GAG GGC CCT GGA AAG GGA ACC ACT 2018 
Asn Ile Trp Ile Glu Ser Glu Gly Pro Gly Lys Gly Thr Thr
565 570 575 

GTC ACG TTT GTA GTG AAA CTC GGA ATC TGT CAC CAT CCA AAT 2060 
Val Thr Phe Val Val Lys Leu Gly Ile Cys His His Pro Asn
580 585 590

GCA TTA CCT CTG CTA CCT ATG CCT CCC AGA GGC AGA TTG AAC 2102 
Ala Leu Pro Leu Leu Pro Met Pro Pro Arg Gly Arg Leu Asn 
595 600 ^ ^ 

AAA GGT AGC GAT GAT CTC TTC AGG TAT AGA CAG TTC CGT GGA 2144 
Lys Gly Ser Asp Asp Leu Phe Arg Tyr Arg Gln Phe Arg Gly

610 615 
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GAT GAT GGT GGG ATG TCT GTG AAT GCT CAA CGC TAT CAA AGA 2186 
Asp Asp Gly Gly Met Ser Val Asn Ala Gln Arg Tyr Gln Arg
620 625 630

AGT ATG TAA A TGACAAAAGG ACATTGGTGT GACAAAGAAC 2226 
Ser Met * 
635 

ATTAAATCAT GACTAGTGAA TTTGAGATTT CTTCACTGTT CTGTACACTC 2276 
CAAATGGCAC AGTTTGTCTT GTAACTAACC TAATTCAATG CTCGTAAAGT 2326 
GAGTACTGGA GTATCTTGAA AATGTAACTA TCGAATTTAT ACATCGAGCT 2376 
TTTGACAAAA AAAAAAAAAA AAAAAAAAA 2405 
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AGATCTGGTA CTACCAAAAG GTATCCAATT AATCCATGCT TGGCCTCCCA 50 

TTACAATGCC TGTAAGAAAT AATTGTTCTT TCCACCTCCA CAACTAATTG 100 

TCGAACTATT ATATCTATCT TTATTCCCTT AAATGTGAAA CGAATTACAC 150 

AGACTATTTG GCGCTACTTT TTTCCTAGAT ATATTGAAGA CCTAGTTTCT 200 

TATATTTGTG GGAAGCATTT GGAAGTTCTA TAAGAACTAT ATCATGTTCG 250 

AAAACATTCT TATAATTTTC GACAAGATTG CTGAAGGAGT GTCTTATCTT 300 

TTATGTATTC TTGACTAGAG GAGTTTAATA AAAAGAAAAT AGAAAGGAAC 350 

AAAGAAACGT ACAAGTGTAT AAAAGGAGTT GGGGCAAAGA CATCAGAAAC 400 

ATTTAGACCT ACGATTTCAT CCTACATGTT ATGGTTTTAG TTCGTTAGAG 450 

GTTTTAACAT ATTAAATCAG CAAAGTTGTG ACATACATAA AGTGCATAAC 500 

ATAAAGATGA AATTCACAAT TTGCTGGATC TTTTGGTGCA AGGGAACTAT 550 

TTTTTACACT ATAAGTTAGC TGTTAATTTC AATATTGGCT CTTCTACACC 600 

TTGTTGTTCT TGAGTATAAT TCTATTTTGC ATCAAACATA TGTCAGAACT 650 

TATGCTGCAA TTAAATATAT TCAGGTTGTT TAACTCTTGT ACAGCTTGTT 700 

ATTCTTCTGA GGTCTATTTC CTTCTCCTTA TTTGCTAACT TGTGCTGCAG 750 

TTATCTTCCA TC GTG GAG TCA TGT AAC TGC ATC ATT GAC CCA 792 
Val Glu Ser Cys Asn Cys Ile Ile Asp Pro
1 5 10 

CAG TTG CCT GCT GAC GAC TTG CTA ATG AAG TAT CAG TAC ATT 834 
Gln Leu Pro Ala Asp Asp Leu Leu Met Lys Tyr Gln Tyr Ile

TCT GAT TTT TTC ATA GCA CTT GCT TAT TTC TCC ATT CCA GTG 876 
Ser Asp Phe Phe Ile Ala Leu Ala Tyr Phe Ser Ile Pro Val
25 30 35 

GAG TTG ATA TAC TTC GTT AAG AAG TCT GCT GTC TTT CCA TAT 918 
Glu Leu Ile Tyr Phe Val Lys Lys Ser Ala Val Phe Pro Tyr
40 45 50 



AGA TGG GTT CTT GTG CAG TTC GGT GCT TTC ATA GTT CTT TGT 960
Arg Trp Val Leu Val Gln Phe Gly Ala Phe Ile Val Leu Cys



GGA GCA ACC CAT CTT ATC AAC TTA TGG ACA TTT AAT ATG CAT 1002 
Gly Ala Thr His Leu Ile Asn Leu Trp Thr Phe Asn Met His



ACA AGG AAT GTG GCA ATA GTA ATG ACT ACT GCA AAG GCC TTG 1044 
Thr Arg Asn Val Ala Ile Val Met Thr Thr Ala Lys Ala Leu
85 90

ACT GCA CTG GTG TCA TGT ATA ACT GCT CTC ATG CTT GTC CAC 1086 
Thr Ala Leu Val Ser Cys Ile Thr Ala Leu Met Leu Val His
95 100 105
ATC ATT CCT GAT TTA TTA AGT GTC AAA ACT AGA GAA CTG TTC 1128
Ile Ile Pro Asp Leu Leu Ser Val Lys Thr Arg Glu Leu Phe
110 115 120

TTG AAA AAG AAA GCT GCA CAG CTT GAG CGT GAA ATG GGT ATT 1170
Leu Lys Lys Lys Ala Ala Gln Leu Asp Arg Glu Met Gly Ile
125 130 135

ATT CGG ACT CAG GAG GAG ACA GGT AGA CAT GTT AGA ATG CTA 1212
Ile Arg Thr Gln Glu Glu Thr Gly Arg His Val Arg Met Leu
140 145 150

ACT CAT GAA ATC CGA AGC ACT CTT GAT AGA CAT ACT ATT TTA 1254
Thr His Glu Ile Arg Ser Thr Leu Asp Arg His Thr Ile Leu
155 160

AAG ACT ACA CTT GTT GAG CTA GGA AGA ACA TTG GCA TTG GAA 1296
Lys Thr Thr Leu Val Glu Leu Gly Arg Thr Leu Ala Leu Glu
165 170 175

GAG TGT GCA TTA TGG ATG CCA ACA CGT ACT GGA CTA GAG CTT 1338
Glu Cys Ala Leu Trp Met Pro Thr Arg Thr Gly Leu Glu Leu
180 185 190

CAG CTT TCT TAG ACT TTA CGA CAC CAA AAT CCA GTT GGA TTA 1380
Gln Leu Ser Tyr Thr Leu Arg His Gln Asn Pro Val Gly Leu
195 200 205

ACT GTA CCC ATT CAA CTT CCT GTA ATC AAT CAA GTT TTC GGT 1422
Thr Val Pro Ile Gln Leu Pro Val Ile Asn Gln Val Phe Gly
210 215 220

ACA AAT CAT GTC GTG AAA ATA TCA CCA AAT TCT CCT GTC GCA 1464
Thr Asn His Val Val Lys Ile Ser Pro Asn Ser Pro Val Ala
225 230

AGA CTT CGA CCT GCT GGG AAA TAC ATG CCT GGT GAG GTG GTT 1506
Arg Leu Arg Pro Ala Gly Lys Tyr Met Pro Gly Glu Val Val
235 240 245

GCT GTC AGG GTT CCA CTT CTG CAT CTG TCG AAC TTT CAG ATT 1548
Ala Val Arg Val Pro Leu Leu His Leu Ser Asn Phe Gln Ile
250 255 260

AAT GAT TGG CCT GAA CTT TCA ACA AAG CGC TAT GCT TTA ATG 1590
Asn Asp Trp Pro Glu Leu Ser Thr Lys Arg Tyr Ala Leu Met
265 270 275

GTT CTG ATG CTT CCT TCA GAC AGT GCA AGA CAA TGG CAT GTT 1632
Val Leu Met Leu Pro Ser Asp Ser Ala Arg Gln Trp His Val
280 285 290

CAT GAG CTG GAG CTT GTT GAA GTG GTA GCT GAT CAG GTT 1671
His Glu Leu Glu Leu Val Glu Val Val Ala Asp Gln Val
295 300

TGATTTTTGT TATTGAAAAT TCCTTAATAT AATGTTAAAA TTTCTCTTTT 1721

ATATATTTTT GGGTTGAACA CAACCACGTT GACATACTGA GTTCTGGGTG 1771

TAAAATTAGA CATGGAGAAG ACCAATTACA AAAATCTGAG AATCTGCTAG 1821

CAGAATCACA AGGCTTAGTT GTTCTTAGTA TTATGGTTTT ATCCATTGGA 1871
ATTGCACAGC AGAATTGTTA TTACTGTTAT TTTTTTTTAA AATTTTCAAA 1921

GATAAATCAA AAGCTGAACT ATATGACTTT TTGCATACTT CGTCTGCTGA 1971 

TTGCTTTTTG GTGATGGAAT AGTTAGGCTG GGTTGTGGAT GAGTATATCA 2021 

TAGTAGATTT TCTGATAGGA TCTTAACTCC TTGGCTTTTG TTTTCTATAG 2071 

ATGATCCCTT GTATTAGAAG CACGGGAAAT AGGATCGATG GTATATAGAA 2121 

ATATTAGGAA CAGCTTTCTG AATCATTTGA ATATTCCTTT TATGGAACAT 2171 

AGAACTCTTG ACGTGTATGT AGTTTTCTTA GTACTTTTAT CATATGAAGT 2221 

GAAAATAACG TTTTGCGATA ATGTATTTGA GTGTGTAAAA TTAAATACTA 2271 

CTGAGTTTTA CAAAAATAAT TCTTCAACGG AAGCCATTTA TTTTTTTTAC 2321 

ATATCTGGCA TCTTACTTCT CCATCAAAGA CTTTAGAGAA CTTTAACTTT 2371 

TTCATTCTGT CTCTCGTAGT GTACTGTTCT CTGATGTATG TAATTAGCTC 2421 

ACTGGCAAGT AGCACACCTA GTCTTTGTTT GACTTGTTTA AAAATCATGA 2471 

TGTATCATCA GTTACGGTGA AGTGTCCAAG TTTTACTGCT TTTTGCTATT 2521 

TGCATTGCAG AGTCTTAAAA CATTTCAGTT ATTCCTGGAT TTCTCCTGTT 2571 

TATCAATGGA AAATTCAACT ATCAACTATG CCTCAATCAA TAAATGAAAC 2621 

CTCTATATCT AACCACTCCA ACTCAGATCC AGAAATCAGA TTTCAAAGAA 2671 

ATTCATCATA ACTCAACTAT AGGATTGCTG TTAACCAAGA GTAATCCTCA 2721 

TTTGTCCAGA CAGGCGACCA GCTATTATGC TTTCATTATG GGAAAAATTG 2771 

ACAATTAATT AAAGGAAGGA ACAACTGAAG AAAAGACATC CTTGTCAGCT 2821 

TCCTCTCCCA ACCCTTGCCT GAATAAGACA AAAAGTTTCT TGGAGAAAAC 2871 

TCTGAATATT GGTATCCACC TCCTTTCTCC TAATTTAGGA TGCTCTATTT 2921 

CTAGACATAT AGGGGAATAC TCTATTCTAG TGGTCGGTGT CTGGTTGCAA 2971 

CTAGTTTTAG ATGTTTATAT GTCTTATTTG ATTTAATAAG AGCTATCCTT 3021

GAGTGCCCAA TGTGATTTAA TCTACGCTTC GGCATTTCAG GTT GCT GTT 3070 

Val Ala Val
305 

GCT CTT TCA CAT GCT GCT ATA TTA GAA GAA TCA ATG AGG GCT 3112 
Ala Leu Ser His Ala Ala Ile Leu Glu Glu Ser Met Arg Ala
310 315 320 

AGG GAT CTT CTT ATG GAG CAG AAT GTG GCT CTT GAT CTG GCA 3154 
Arg Asp Leu Leu Met Glu Gln Asn Val Ala Leu Asp Leu Ala
325 330 

AGA AGA GAA GCA GAA ATG GCT GTT CGT GCA CGT AAT GAT TTC 3196 
Arg Arg Glu Ala Glu Met Ala Val Arg Ala Arg Asn Asp Phe 
335 340 345 
TTG GCT GTT ATG AAT CAT GAA ATG AGA ACT CCC ATG CAT GCA 3238
Leu Ala Val Met Asn His Glu Met Arg Thr Pro Met His Ala
350 355 

ATA ATT GCA CTT TCT TCC TTA CTA CAA GAA ATC GAT CTA ACT 3280 
Ile Ile Ala Leu Ser Ser Leu Leu Gln Glu Ile Asp Leu Thr
365 370 375 

CCA GAG CAA CGT CTG ATG GTT GAA ACA ATC CTC AAA AGC AGC 3322 
Pro Glu Gln Arg Leu Met Val Glu Thr Ile Leu Lys Ser Ser
380 385 390 

AAC CTT TTA GCA ACG CTC ATC AAC GAT GTC TTG GAT CTT TCA 3364 
Asn Leu Leu Ala Thr Leu Ile Asn Asp Val Leu Asp Leu Ser

395 400 

AGG CTA GAG GAT GGA AGT CTT CAA CTT GAT ATT GGC ACT TTC 3406 
Arg Leu Glu Asp Gly Ser Leu Gln Leu Asp Ile Gly Thr Phe
405 410 415 

AAT CTC CAT GCT TTA TTT AGA GAG GTG CCCTTCATCA CCCTCTTTTC 3453 
Asn Leu His Ala Leu Phe Arg Glu Val

420 425 

TTTTTTACTT GCAAATTCTA GATTACCTGT CAGAAAAAAA GTGTCATTAC 3503 

AGATATTTTG CACTTCAATA TGTTTGCTGG ACCTGCTGAC TGATATATGT 3553 

GTCTGCTTAT TCCTGTAG GTC CAT AGC TTA ATC AAG CCT ATT GCA 3598
Val His Ser Leu Ile Lys Pro Ile Ala
430 435 

TCT GTG AAA AAG TCT GTT GCT CAA CTT AGT TTG TCG TCA GAT 3640
Ser Val Lys Lys Ser Val Ala Gln Leu Ser Leu Ser Ser Asp
440 445 450 

TTG CCG GAA TAT GTA ATT GGG GAT GAA AAA CGG TTA ATG CAA 3682 
Leu Pro Glu Tyr Val Ile Gly Asp Glu Lys Arg Leu Met Gln

455 460 

ATT CTC TTA AAC GTT GTT GGC AAT GCT GTA AAG TTC TCA AAG 3724
Ile Leu Leu Asn Val Val Gly Asn Ala Val Lys Phe Ser Lys
465 470 475 

GAA GGC AAC GTA TCA ATC TCC GCT TTT GTT GCA AAA TCA GAC 3766 
Glu Gly Asn Val Ser Ile Ser Ala Phe Val Ala Lys Ser Asp
480 485 490 

TCT TTA AGA GAT CCT AGA GCC CCT GAA TTT TTT GCT GTG CCT 3808 
Ser Leu Arg Asp Pro Arg Ala Pro Glu Phe Phe Ala Val Pro 
495 500 505 

AGT GAA AAT CAC TTC TAT TTA CGG GTG CAG 3838 
Ser Glu Asn His Phe Tyr Leu Arg Val Gln
510 515 

GTATATTTTT ACAAGCTTGA TATACTATCT TCGTAGGTTA AGGATAGTCA 3888 

CAAATATGAT ATTTTAGACT TATAACTGTC AGATGTTCTG TTCTTGATAT 3938 

TTGTAATATT CTAAGTAATA CTTTCTGTAG 3968 
ATA AAA GAT ACG GGG ATA GGA ATT ACA CCA CAG GAT ATT CCC 4010
Ile Lys Asp Thr Gly Ile Gly Ile Thr Pro Gln Asp Ile Pro
520 525 530

AAC CTG TTT AGC AAG TTT ACA CAA AGC CAA GCG CTA GCA ACT 4052 
Asn Leu Phe Ser Lys Phe Thr Gln Ser Gln Ala Leu Ala Thr

535 540 

ACA AAT TCT GGT GGC ACT GGG CTT GGT CTT GCA ATT TGT AAG 4094 
Thr Asn Ser Gly Gly Thr Gly Leu Gly Leu Ala Ile Cys Lys
545 550 555 

AG GTACGGGTAC CAGTTCCTTA GTGTTCTTTT TCCGACTCTG 4136 
Arg 

ATTTTCATTC TACGTGAACT TGGTAACTGC TTCATATTCA ATTTCTTTCT 4186 

CTTACTGTAT TTACGTATTG ACACATCTCC TGATGGGACA CAAAAAG G 4234 

TTT GTG AAT CTT ATG GAA GGA CAT ATT TGG ATT GAA AGT GAA 4276
Phe Val Asn Leu Met Glu Gly His Ile Trp Ile Glu Ser Glu
560 565 570 

GGT CTT GGC AAG GGG TCT ACT GCT ATA TTT ATC ATT AAA CTT 4318 
Gly Leu Gly Lys Gly Ser Thr Ala Ile Phe Ile Ile Lys Leu
575 580 585 

GGA CTT CCT GGA CGT GCA AAT GAA TCT AAG CTG CCC TTT GTG 4360 
Gly Leu Pro Gly Arg Ala Asn Glu Ser Lys Leu Pro Phe Val 
590 595 600

ACC AAA TTG CCA GCA AAT CAC ACG CAG ATG AGT TTT AAG GAT 4402 
Thr Lys Leu Pro Ala Asn His Thr Gln Met Ser Phe Lys Asp
605 610 615

TAAAGGTTTT GGTGATGGAT GAGAATGGGT GAGTACTATC TGGACCCCTT 4452 

TATCCTCGAC TCTTGTCTTG CCATGCTGTT TAATGATCCA TCTGATTGCG 4502 

TGATTTCTCA TCTTATATGT ATTGAGCTGT CTTACTCACT TTACATGAGA 4552 

CTACAGTAAT ACTT 4566
iUVGATAAGAG TGATTCATTA AGGAGTTTGT TC ATC ATG GAT TGT AAC 47
Ile Met Asp Cys Asn
1 5

r?,^ DhE n^"^ S"^ "^"^^ "^CT GCC GAT GAG TTG TTA ATG AAG 89
Cys Phe Asp Pro Leu Leu Pro Ala Asp Glu Leu Leu Met Lys

l^l r^^ ^T*^ ■^'T'T TTC ATT GCA GTT GCT TAT TTT 131
30

tT^ ^T^ ^ ^"^^ G'TA TTC TTT GTC CAG AAA TCA GCT 173 

Ser Ile Pro Ile Glu Leu Val Phe Phe Val Gln Lys Ser Ala

40 45 

GTT TTT CCG TAT CGA TGG GTG CTT GTG CAG TTT GGT GCT TTC 215 
Val Phe Pro Tyr Arg Trp Val Leu Val Gln Phe Gly Ala Phe
50 55 60

ATA GTT CTT TGT GGA GCA ACA CAC CTT ATC AAT TTG TGG ACT 257 
Ile Val Leu Cys Gly Ala Thr His Leu Ile Asn Leu Trp Thr
65 70 75

121 ^S*^ S^"^ AGG ACT GTG GCA ATG GTG ATG ACT ACG 299 

Ser Thr Pro His Thr Arg Thr Val Ala Met Val Met Thr Thr 

80 85 

A?? oK^ ^9"^ *^CG GTA TCA TGT GCA ACT GCT GTC 341 

Ala Lys Phe Ser Thr Ala Ala Val Ser Cys Ala Thr Ala Val 

95 100 

ATG CTT GTC GCA ATT ATT CCG GAT TTA TTA AGT GTC AAA ACT 383 
Met Leu Val Ala Ile Ile Pro Asp Leu Leu Ser Val Lys Thr
105 110 115

tSS 11^ AAA AAC AAA GCG GCG GAA CTT GAT CGT 425 

Arg Glu Leu Phe Leu Lys Asn Lys Ala Ala Glu Leu Asp Arg 
120 125 130

^ r?*^ ACA CAG GAG GAG ACG GGT AGA TAT 467 

Glu Met Gly Leu Ile Arg Thr Gln Glu Glu Thr Gly Arg Tyr
135 140 145

GTT AGA ATG CTA ACA CAT GAA ATC AGA AGT ACT CTG GAT AGA 509 
Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp Arg
150 155

S^I tT*^ F^ AAG ACT ACA CTT GTT GAA CTT GGA AGA GCA 551 

His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg Ala
160 165 170

tI?. 9^ 9"^^ ^ ^G'T GCT TTG TGG ATG CCG ACT CGA ACT 593 

Leu Gln Leu Glu Glu Cys Ala Leu Trp Met Pro Thr Arg Thr
175 180 185

GGA GTG GAG CTT CAA CTT TCT TAC ACT TTA CAT CAT CAA AAT 635 
Gly Val Glu Leu Gln Leu Ser Tyr Thr Leu His His Gln Asn
190 195 200 

?^ 11"^ ^^A GTA CCT ATA CAA CTC CCT GTA ATT AAT 677 

Pro Val Gly Phe Thr Val Pro Ile Gln Leu Pro Val Ile Asn
205 210 215

FIG. 19A 


WO 95/01439 PCT/US94/07418



CAA GTT TTC AGT GCA AAT TGT GCT GTT AAA ATT TCA CCT 716 
Gln Val Phe Ser Ala Asn Cys Ala Val Lys Ile Ser Pro

220 

TAATCTGCCG TTGCAAGGCT T 737
TTTTTTTTTT GTCAAAAGCT CGATGTAAAA ATCCGATGGC CACAAGCAAA 50

ACGACAGGTT CCAACTTCAC GGAGATTGTG AAAATGGAGT AGTAGTTCAG 100 

TGAAGTAGTA GATACTGAGA TCGCATTCTC CGGCGTCGTT TTTCACATCG 150 

AAATAGTCGT GTAAAAAAAT GAAAAAATTG CTGCGAGACA GGTATGTGTC 200 

GCAGCAGGAA ATAGCATCTT AAAGGAAGGA AGGAAGGAAA CTCGAAAGTT 250 

ACTAAAAATT TTTGATTCTT TGGGACGAAA CGAGATA ATG GAA TCC TGT 299 

Met Glu Ser Cys 

GAT TGC ATT GAG GCT TTA CTG CCA ACT GGT GAC CTG CTG GTT 341 
Asp Cys Ile Glu Ala Leu Leu Pro Thr Gly Asp Leu Leu Val

AAA TAG CAA TAC CTC TCA GAT TTC TTC ATT GCT GTA GCC TAG 383 
Lys Tyr Gln Tyr Leu Ser Asp Phe Phe Ile Ala Val Ala Tyr
20 25 30 

TTT TCC ATT CTG TTG GAG CTT ATT TAT TTT GTC CAC AAA TCT 425 
Phe Ser Ile Leu Leu Glu Leu Ile Tyr Phe Val His Lys Ser
35 40 45 

GCA TGC TTC CCA TAC AGA TGG GTC CTC ATG CAA TTT GGT GCT 467 
Ala Cys Phe Pro Tyr Arg Trp Val Leu Met Gln Phe Gly Ala
50 55 60 

TTx ATT GTG CTC TGT GGA GCA ACA CAC TTT ATT AGC TTG TGG 509
Phe Ile Val Leu Cys Gly Ala Thr His Phe Ile Ser Leu Trp
65 70

IZ^ IZ"^ TCT AAG ACG GTC GCT GTG GTT ATG ACC 551 

Thr Phe Phe Met His Ser Lys Thr Val Ala Val Val Met Thr 
75 80 85

ATA TCA AAA ATG TTG ACA GCT GCC GTG TCC TGT ATC ACA GCT 593 
Ile Ser Lys Met Leu Thr Ala Ala Val Ser Cys Ile Thr Ala

90 95 100 

TTG ATG CTT GTT CAC ATT ATT CCT GAT TTG CTA AGT GTT AAA 635 
Leu Met Leu Val His Ile Ile Pro Asp Leu Leu Ser Val Lys
105 110 115 

ACG CGA GAG TTG TTC TTG AAA ACT CGA GCT GAA GAG CTT GAC 677 
Thr Arg Glu Leu Phe Leu Lys Thr Arg Ala Glu Glu Leu Asp 
120 125 130

AAG GAA ATG GGC CTA ATA ATA AGA CAA GAA GAA ACT GGC AGA 719 
Lys Glu Met Gly Leu Ile Ile Arg Gln Glu Glu Thr Gly Arg

135 140 

CAT GTC AGG ATG CTG ACT CAT GAG ATA AGA AGC ACA CTC GAC 761 
His Val Arg Met Leu Thr His Glu Ile Arg Ser Thr Leu Asp
145 150 155 

AGA CAC ACA ATC TTG AAG ACT ACT CTT GTG GAG CTA GGT AGG 803 
Arg His Thr Ile Leu Lys Thr Thr Leu Val Glu Leu Gly Arg
160 165 170 
ACr TTA GAC CTG GCA GAA TGT GCT TTG TGG ATG CCA TGC CAA 845
Thr Leu Asp Leu Ala Glu Cys Ala Leu Trp Met Pro Cys Gln
175 180 

- ^rn^ aCT TTG CAA CTT TCC CAT AAT TTA AAC AAT CTA 887 

Gly Gl5 Leu Thr Leu Gln Leu Ser His Asn Leu Asn Asn Leu
190 195 200

ATA CCT CTG GGA TCT ACT GTG CCA ATT AAT CTT CCT ATT ATC 929 
Ile Pro Leu Gly Ser Thr Val Pro Ile Asn Leu Pro Ile Ile

205 210 

AAT GAA ATT TTT AGT AGC CCT GAA GCA ATA CAA ATT CCA CAT 971 
Asn Glu Ile Phe Ser Ser Pro Glu Ala Ile Gln Ile Pro His
215 220 225 

ACA AAT CCT TTG GCA AGG ATG AGG AAT ACT GTT GGT AGA TAT 1013 
Thr Asn Pro Leu Ala Arg Met Arg Asn Thr Val Gly Arg Tyr
230 235 240

ATT CCA CCA GAA GTA GTT GCT GTT CGT GTA CCG CTT TTA CAC 1055 
Ile Pro Pro Glu Val Val Ala Val Arg Val Pro Leu Leu His
245 250 255

CTC TCA AAT TTT ACT AAT GAC TGG GCT GAA CTG TCT ACT AGA 1097 
Leu Ser Asn Phe Thr Asn Asp Trp Ala Glu Leu Ser Thr Arg 
260 265 270

AGT TAT GCG GTT ATG GTT CTG GTT CTC CCG ATG AAT GGC TTA 1139 
Ser Tyr Ala Val Met Val Leu Val Leu Pro Met Asn Gly Leu

275 280 

AGA AAG TGG CGT GAA CAT GAG TTA GAA CTT GTG CAA GTT GTC 1181 
Arg Lys Trp Arg Glu His Glu Leu Glu Leu Val Gln Val Val
285 290 295

GCA GAT CAG GTT GCT GTC GCT CTT TCA CAT GCT GCA ATT TTA 1223 
Ala Asp Gln Val Ala Val Ala Leu Ser His Ala Ala Ile Leu
300 305 310 

GAA GAT TCC ATG CGA GCC CAT GAT CAG CTC ATG GAA CAG AAT 1265 
Glu Asp Ser Met Arg Ala His Asp Gln Leu Met Glu Gln Asn

ATT GCT TTG GAT GTA GCT CGA CAA GAA GCA GAG ATG GCC ATC 1307 
Ile Ala Leu Asp Val Ala Arg Gln Glu Ala Glu Met Ala Ile

p-™ GAC TTC CTT GCT GTG ATG AAC CAT GAA ATG 1349 

Arg Ala Arg AtS Asp Phe Leu Ala Val Met Asn His Glu Met



AGA ACG CCC ATG CAT GCA GTT ATT GCT CTG TGC TCT CTG CTT 1391 
Arg Thr Pro Met His Ala Val Ile Ala Leu Cys Ser Leu Leu
355 360 

TTA GAA ACA GAC TTA ACT CCA GAG CAG AGA GTT ATG ATT GAG 1433 
Leu Glu Thr Asp Leu Thr Pro Glu Gln Arg Val Met Ile Glu
370 375 380
ACC ATA TTG AAG AGC AGC AAT CTT CTT GCA ACA CTG ATA AAT 1475
Thr Ile Leu Lys Ser Ser Asn Leu Leu Ala Thr Leu Ile Asn
385 390 395 

GAT GTT CTA GAT CTT TCT AGA CTT GAA GAT GGT ATT CTT GAA 1517 
Asp Val Leu Asp Leu Ser Arg Leu Glu Asp Gly Ile Leu Glu
400 405 410 

CTA GAA AAC GGA ACA TTC AAT CTT CAT GGC ATC TTA AGA GAG 1559 
Leu Glu Asn Gly Thr Phe Asn Leu His Gly Ile Leu Arg Glu

415 420 

GCC GTT AAT TTG ATA AAG CCA ATT GCA TCT TTG AAG AAA TTA 1601 

Ala Val Asn Leu Ile Lys Pro Ile Ala Ser Leu Lys Lys Leu
425 430 435 

TCT ATA ACT CTT GCT TTG GCT CTG GAT TTA CCT ATT CTT GCT 1643 

Ser Ile Thr Leu Ala Leu Ala Leu Asp Leu Pro Ile Leu Ala
440 445 450 

GTG GGT GAT GCA AAA CGT CTT ATC CAA ACT CTC TTA AAC GTG 1685 
Val Gly Asp Ala Lys Arg Leu Ile Gln Thr Leu Leu Asn Val
455 460 465 

GTG GGA AAT GCT GTG AAG TTC ACT AAA GAA GGA CAT ATT TCA 1727 
Val Gly Asn Ala Val Lys Phe Thr Lys Glu Gly His Ile Ser
470 475 480 

ATT GAG GCT TCA GTT GCC AAA CCA GAG TAT GCG AGA GAT TGT 1769 
Ile Glu Ala Ser Val Ala Lys Pro Glu Tyr Ala Arg Asp Cys

485 490 

CAT CCT CCT GAA ATG TTC CCT ATG CCA AGT GAT GGC CAG TTT 1811 
His Pro Pro Glu Met Phe Pro Met Pro Ser Asp Gly Gln Phe
495 500 505 

TAT TTG CGT GTC CAG GTT AGA GAT ACT GGG TGT GGA ATT AGC 1853 
Tyr Leu Arg Val Gln Val Arg Asp Thr Gly Cys Gly Ile Ser
510 515 520 

CCA CAA GAT ATA CCA CTA GTA TTC ACC AAA TTT GCA GAG TCA 1895 
Pro Gln Asp Ile Pro Leu Val Phe Thr Lys Phe Ala Glu Ser
525 530 535 



CGG CCT ACG TCA AAT CGA AGT ACT GGA GGG GAA GGT CTA GGG 1937 
Arg Pro Thr Ser Asn Arg Ser Thr Gly Gly Glu Gly Leu Gly 
540 545 550 



CTT GCC ATT TGG AGA CGA TTT ATT CAA CTT ATG AAA GGT AAC 1979 

Leu Ala Ile Trp Arg Arg Phe Ile Gln Leu Met Lys Gly Asn
555 560

ATT TGG ATT GAG AGT GAG GGC CCT GGA AAG GGA ACC ACT GTC 2021 

Ile Trp Ile Glu Ser Glu Gly Pro Gly Lys Gly Thr Thr Val
565 570 575

ACG TTT GTA GTG AAA CTC GGA ATC TGT CAC CAT CCA AAT GCA 2063 

Thr Phe Val Val Lys Leu Gly Ile Cys His His Pro Asn Ala
580 585 590 
TTA CCT CTG CTA CCT ATG CCT CCC AGA GGC AGA TTG AAC AAA 2105
Leu Pro Leu Leu Pro Met Pro Pro Arg Gly Arg Leu Asn Lys
595 600 605 

GGT AGC GAT GAT CTC TTC AGG TAT AGA CAG TTC CGT GGA GAT 2147 
Gly Ser Asp Asp Leu Phe Arg Tyr Arg Gln Phe Arg Gly Asp
610 615 620 

GAT GGT GGG ATG TCT GTG AAT GCT CAA CGC TAT CAA AGA AGT 2189 
Asp Gly Gly Met Ser Val Asn Ala Gln Arg Tyr Gln Arg Ser

625 630 

ATG TAA A TGACAAAAGG ACATTGGTGT GACAAAGAAC ATTAAATCAT 2236 

Met * 

635 

GACTAGTGAA TTTGAGATTT CTTCACTGTT CTGTACACTC CAAATGGCAC 2286 

AGTTTGTCTT GTAACTAACC TAATTCAATG CTCGTAAAGT GAGTACTGGA 2336 

GTATCTTGAA AATGTAACTA TCGAATTTAT ACATCGAGCT TTTGACAAAA 2386 
AAAAAAAAAA AAAAAAAAA 2405 